commit 30ad2851a645bb5f42c72f21ceb166877cf7e695 Author: Sasha Levin Date: Mon Jan 22 12:40:06 2018 -0800 Linux 4.1.49 Signed-off-by: Sasha Levin commit 623dfab42becf5c56c9a31b7eaf90cb6eb86459f Author: Long Li Date: Wed Jan 10 13:21:29 2018 -0700 storvsc: do not assume SG list is continuous when doing bounce buffers The original patch was made for stable 4.1 and was Acked on 08/22/2017, but for some reason it never made it to the stable tree. Change from v1: Changed comment that this patch is for linux-stable 4.1 and all prior stable kernels. storvsc checks the SG list for gaps before passing them to Hyper-v device. If there are gaps, data is copied to a bounce buffer and a continuous data buffer is passed to Hyper-V. The check on gaps assumes SG list is continuous, and not chained. This is not always true. Failing the check may result in incorrect I/O data passed to the Hyper-v device. This code path is not used post Linux 4.1. [LL: Backport for 4.1] Signed-off-by: Long Li Acked-by: Martin K. Petersen Signed-off-by: Sasha Levin commit feedfe2129e348e075b72a5e3e1c7c52d0c769e4 Author: Martin Kepplinger Date: Fri Nov 6 16:31:02 2015 -0800 bitops.h: add sign_extend64() [ Upstream commit 48e203e21b29cd4b2c58403fe8bca68e2e854895 ] Months back, this was discussed, see https://lkml.org/lkml/2015/1/18/289 The result was the 64-bit version being "likely fine", "valuable" and "correct". The discussion fell asleep but since there are possible users, let's add it. Signed-off-by: Martin Kepplinger Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Thomas Gleixner Cc: "H. Peter Anvin" Cc: George Spelvin Cc: Rasmus Villemoes Cc: Maxime Coquelin Cc: Denys Vlasenko Cc: Yury Norov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit ba00cc7686f6df5ca08a2a40d5c80ef744368d23 Author: Nicholas Krause Date: Sat Jul 4 16:34:32 2015 -0400 gpio: 74xx: Fix build warning about void to integer cast [ Upstream commit 54442658d8e83b0589731620bd958cc8b2857167 ] This fixes the build warning , warning: cast from pointer to integer of different size when building this file on a x86 allmodconfig configuration. In order for me to fix this build warning I changed the cast in the function mmio_74xx_gpio_probe from casting the variable data of the stucture pointer of_id to uintptr_t rather then unsigned when assigning to the variable flag of the structure pointer priv of the structure type mmio_74xx_gpio_priv. Signed-off-by: Nicholas Krause Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin commit 6f2248686a9b20ad8c9bea6126b2495d7f8048e7 Author: Arnd Bergmann Date: Fri Jul 1 18:02:22 2016 +0100 ARM: 8584/1: floppy: avoid gcc-6 warning [ Upstream commit dd665be0e243873343a28e18f9f345927b658daf ] gcc-6.0 warns about comparisons between two identical expressions, which is what we get in the floppy driver when writing to the FD_DOR register: drivers/block/floppy.c: In function 'set_dor': drivers/block/floppy.c:810:44: error: self-comparison always evaluates to true [-Werror=tautological-compare] fd_outb(newdor, FD_DOR); It would be nice to use a static inline function instead of the macro, to avoid the warning, but we cannot do that because the FD_DOR definition is incomplete at this point. Adding a cast to (u32) is a harmless way to shut up the warning, just not very nice. Signed-off-by: Arnd Bergmann Signed-off-by: Russell King Signed-off-by: Sasha Levin commit 74b6cfd76efd9787afc57dd8983881911b8806ec Author: Arnd Bergmann Date: Wed Nov 16 16:20:37 2016 +0100 ARM: ux500: fix prcmu_is_cpu_in_wfi() calculation [ Upstream commit f0e8faa7a5e894b0fc99d24be1b18685a92ea466 ] This function clearly never worked and always returns true, as pointed out by gcc-7: arch/arm/mach-ux500/pm.c: In function 'prcmu_is_cpu_in_wfi': arch/arm/mach-ux500/pm.c:137:212: error: ?: using integer constants in boolean context, the expression will always evaluate to 'true' [-Werror=int-in-bool-context] With the added braces, the condition actually makes sense. Fixes: 34fe6f107eab ("mfd : Check if the other db8500 core is in WFI") Signed-off-by: Arnd Bergmann Acked-by: Daniel Lezcano Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin commit cef20adbe64cd8550ede8d551f8f6ebcf805d5ef Author: Linus Torvalds Date: Wed Jul 12 19:25:47 2017 -0700 disable new gcc-7.1.1 warnings for now [ Upstream commit bd664f6b3e376a8ef4990f87d08271cc2d01ba9a ] I made the mistake of upgrading my desktop to the new Fedora 26 that comes with gcc-7.1.1. There's nothing wrong per se that I've noticed, but I now have 1500 lines of warnings, mostly from the new format-truncation warning triggering all over the tree. We use 'snprintf()' and friends in a lot of places, and often know that the numbers are fairly small (ie a controller index or similar), but gcc doesn't know that, and sees an 'int', and thinks that it could be some huge number. And then complains when our buffers are not able to fit the name for the ten millionth controller. These warnings aren't necessarily bad per se, and we probably want to look through them subsystem by subsystem, but at least during the merge window they just mean that I can't even see if somebody is introducing any *real* problems when I pull. So warnings disabled for now. Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 3c59dd4fae37154386b4a04543b137e054fba04a Author: Arnd Bergmann Date: Thu Nov 24 17:26:22 2016 +0100 irda: fix overly long udelay() [ Upstream commit c9bd28233b6d0d82ac3ba0215723be0a8262c39c ] irda_get_mtt() returns a hardcoded '10000' in some cases, and with gcc-7, we get a build error because this triggers a compile-time check in udelay(): drivers/net/irda/w83977af_ir.o: In function `w83977af_hard_xmit': w83977af_ir.c:(.text.w83977af_hard_xmit+0x14c): undefined reference to `__bad_udelay' Older compilers did not run into this because they either did not completely inline the irda_get_mtt() or did not consider the 10000 value a constant expression. The code has been wrong since the start of git history. Signed-off-by: Arnd Bergmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit eaedee951a16779f8056e50c799ff7e6f455c953 Author: Martin Liska Date: Fri May 12 15:46:35 2017 -0700 gcov: support GCC 7.1 [ Upstream commit 05384213436ab690c46d9dfec706b80ef8d671ab ] Starting from GCC 7.1, __gcov_exit is a new symbol expected to be implemented in a profiling runtime. [akpm@linux-foundation.org: coding-style fixes] [mliska@suse.cz: v2] Link: http://lkml.kernel.org/r/e63a3c59-0149-c97e-4084-20ca8f146b26@suse.cz Link: http://lkml.kernel.org/r/8c4084fa-3885-29fe-5fc4-0d4ca199c785@suse.cz Signed-off-by: Martin Liska Acked-by: Peter Oberparleiter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 8ec347fc139013501ba175fa9760b3b79796da7f Author: Florian Meier Date: Thu Jul 14 12:07:26 2016 -0700 gcov: add support for gcc version >= 6 [ Upstream commit d02038f972538b93011d78c068f44514fbde0a8c ] Link: http://lkml.kernel.org/r/20160701130914.GA23225@styxhp Signed-off-by: Florian Meier Reviewed-by: Peter Oberparleiter Tested-by: Peter Oberparleiter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 4862c0509c9d20d610da6a15ac9ce86de3dc5ba6 Author: Lorenzo Stoakes Date: Tue Jun 30 14:57:49 2015 -0700 gcov: add support for GCC 5.1 [ Upstream commit 3e44c471a2dab210f7e9b1e5f7d4d54d52df59eb ] Fix kernel gcov support for GCC 5.1. Similar to commit a992bf836f9 ("gcov: add support for GCC 4.9"), this patch takes into account the existence of a new gcov counter (see gcc's gcc/gcov-counter.def.) Firstly, it increments GCOV_COUNTERS (to 10), which makes the data structure struct gcov_info compatible with GCC 5.1. Secondly, a corresponding counter function __gcov_merge_icall_topn (Top N value tracking for indirect calls) is included in base.c with the other gcov counters unused for kernel profiling. Signed-off-by: Lorenzo Stoakes Cc: Andrey Ryabinin Cc: Yuan Pengfei Tested-by: Peter Oberparleiter Reviewed-by: Peter Oberparleiter Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit daa876c7e8b4f66d7fceb19eb1959f47ea103bff Author: Arnd Bergmann Date: Thu Nov 19 11:42:26 2015 +0100 net: tulip: turn compile-time warning into dev_warn() [ Upstream commit de92718883ddbcd11b738d36ffcf57617b97fa12 ] The tulip driver causes annoying build-time warnings for allmodconfig builds for all recent architectures: dec/tulip/winbond-840.c:910:2: warning: #warning Processor architecture undefined dec/tulip/tulip_core.c:101:2: warning: #warning Processor architecture undefined! This is the last remaining warning for arm64, and I'd like to get rid of it. We don't really know the cache line size, architecturally it would be at least 16 bytes, but all implementations I found have 64 or 128 bytes. Configuring tulip for 32-byte lines as we do on ARM32 seems to be the safe but slow default, and nobody who cares about performance these days would use a tulip chip anyway, so we can just use that. To save the next person the job of trying to find out what this is for and picking a default for their architecture just to kill off the warning, I'm now removing the preprocessor #warning and turning it into a pr_warn or dev_warn that prints the equivalent information when the driver gets loaded. Signed-off-by: Arnd Bergmann Acked-by: Grant Grundler Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit aef3297a50e7fd666a817aba566537fceeee0f1a Author: Miaoqing Pan Date: Wed Sep 27 09:13:34 2017 +0800 ath9k: fix tx99 potential info leak [ Upstream commit ee0a47186e2fa9aa1c56cadcea470ca0ba8c8692 ] When the user sets count to zero the string buffer would remain completely uninitialized which causes the kernel to parse its own stack data, potentially leading to an info leak. In addition to that, the string might be not terminated properly when the user data does not contain a 0-terminator. Signed-off-by: Miaoqing Pan Reviewed-by: Christoph Böhmwalder Signed-off-by: Kalle Valo Signed-off-by: Sasha Levin commit 411934395be9e8074f08876c4b9fcbfee2574f43 Author: Alex Vesker Date: Tue Oct 10 10:36:41 2017 +0300 IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop [ Upstream commit b4b678b06f6eef18bff44a338c01870234db0bc9 ] When ndo_open and ndo_stop are called RTNL lock should be held. In this specific case ipoib_ib_dev_open calls the offloaded ndo_open which re-sets the number of TX queue assuming RTNL lock is held. Since RTNL lock is not held, RTNL assert will fail. Signed-off-by: Alex Vesker Signed-off-by: Sasha Levin commit 1e6e40c8b310566b8d49c272be4fe1fe550409ff Author: Alexander Duyck Date: Fri Oct 13 13:40:24 2017 -0700 macvlan: Only deliver one copy of the frame to the macvlan interface [ Upstream commit dd6b9c2c332b40f142740d1b11fb77c653ff98ea ] This patch intoduces a slight adjustment for macvlan to address the fact that in source mode I was seeing two copies of any packet addressed to the macvlan interface being delivered where there should have been only one. The issue appears to be that one copy was delivered based on the source MAC address and then the second copy was being delivered based on the destination MAC address. To fix it I am just treating a unicast address match as though it is not a match since source based macvlan isn't supposed to be matching based on the destination MAC anyway. Fixes: 79cf79abce71 ("macvlan: add source mode") Signed-off-by: Alexander Duyck Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 89e819fc3f468e6a539596c24146d014fa9078c6 Author: Jan Kara Date: Mon Oct 16 11:38:11 2017 +0200 udf: Avoid overflow when session starts at large offset [ Upstream commit abdc0eb06964fe1d2fea6dd1391b734d0590365d ] When session starts beyond offset 2^31 the arithmetics in udf_check_vsd() would overflow. Make sure the computation is done in large enough type. Reported-by: Cezary Sliwa Signed-off-by: Jan Kara Signed-off-by: Sasha Levin commit de6513563f35dead300c82411c174d8834b84bb3 Author: Dan Carpenter Date: Wed Oct 4 10:50:37 2017 +0300 scsi: bfa: integer overflow in debugfs [ Upstream commit 3e351275655d3c84dc28abf170def9786db5176d ] We could allocate less memory than intended because we do: bfad->regdata = kzalloc(len << 2, GFP_KERNEL); The shift can overflow leading to a crash. This is debugfs code so the impact is very small. I fixed the network version of this in March with commit 13e2d5187f6b ("bna: integer overflow bug in debugfs"). Fixes: ab2a9ba189e8 ("[SCSI] bfa: add debugfs support") Signed-off-by: Dan Carpenter Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 67cd91a587fa5bf071406a1ed91a9ed13b480973 Author: Jia-Ju Bai Date: Mon Oct 9 16:45:55 2017 +0800 vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend [ Upstream commit 42c8eb3f6e15367981b274cb79ee4657e2c6949d ] The driver may sleep under a spinlock, and the function call path is: vt6655_suspend (acquire the spinlock) pci_set_power_state __pci_start_power_transition (drivers/pci/pci.c) msleep --> may sleep To fix it, pci_set_power_state is called without having a spinlock. This bug is found by my static analysis tool and my code review. Signed-off-by: Jia-Ju Bai Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 911c1eaea477b8c4938e851b5720044e4e323505 Author: Kurt Garloff Date: Tue Oct 17 09:10:45 2017 +0200 scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry [ Upstream commit 909cf3e16a5274fe2127cf3cea5c8dba77b2c412 ] All EMC SYMMETRIX support REPORT_LUNS, even if configured to report SCSI-2 for whatever reason. Signed-off-by: Kurt Garloff Signed-off-by: Hannes Reinecke Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 4695854c19d6831928e2467b9f7a375bbd6d4f7b Author: NeilBrown Date: Tue Oct 17 16:18:36 2017 +1100 raid5: Set R5_Expanded on parity devices as well as data. [ Upstream commit 235b6003fb28f0dd8e7ed8fbdb088bb548291766 ] When reshaping a fully degraded raid5/raid6 to a larger nubmer of devices, the new device(s) are not in-sync and so that can make the newly grown stripe appear to be "failed". To avoid this, we set the R5_Expanded flag to say "Even though this device is not fully in-sync, this block is safe so don't treat the device as failed for this stripe". This flag is set for data devices, not not for parity devices. Consequently, if you have a RAID6 with two devices that are partly recovered and a spare, and start a reshape to include the spare, then when the reshape gets past the point where the recovery was up to, it will think the stripes are failed and will get into an infinite loop, failing to make progress. So when contructing parity on an EXPAND_READY stripe, set R5_Expanded. Reported-by: Curt Signed-off-by: NeilBrown Signed-off-by: Shaohua Li Signed-off-by: Sasha Levin commit 50cad28ec97ea9a0a6ed9018c24384dee57fb8e3 Author: Linus Walleij Date: Wed Oct 11 11:57:15 2017 +0200 pinctrl: adi2: Fix Kconfig build problem [ Upstream commit 1c363531dd814dc4fe10865722bf6b0f72ce4673 ] The build robot is complaining on Blackfin: drivers/pinctrl/pinctrl-adi2.c: In function 'port_setup': >> drivers/pinctrl/pinctrl-adi2.c:221:21: error: dereferencing pointer to incomplete type 'struct gpio_port_t' writew(readw(®s->port_fer) & ~BIT(offset), ^~ drivers/pinctrl/pinctrl-adi2.c: In function 'adi_gpio_ack_irq': >> drivers/pinctrl/pinctrl-adi2.c:266:18: error: dereferencing pointer to incomplete type 'struct bfin_pint_regs' if (readl(®s->invert_set) & pintbit) ^~ It seems the driver need to include and to compile. The Blackfin architecture was re-defining the Kconfig PINCTRL symbol which is not OK, so replaced this with PINCTRL_BLACKFIN_ADI2 which selects PINCTRL and PINCTRL_ADI2 just like most arches do. Further, the old GPIO driver symbol GPIO_ADI was possible to select at the same time as selecting PINCTRL. This was not working because the arch-local header contains an explicit #ifndef PINCTRL clause making compilation break if you combine them. The same is true for DEBUG_MMRS. Make sure the ADI2 pinctrl driver is not selected at the same time as the old GPIO implementation. (This should be converted to use gpiolib or pincontrol and move to drivers/...) Also make sure the old GPIO_ADI driver or DEBUG_MMRS is not selected at the same time as the new PINCTRL implementation, and only make PINCTRL_ADI2 selectable for the Blackfin families that actually have it. This way it is still possible to add e.g. I2C-based pin control expanders on the Blackfin. Cc: Steven Miao Cc: Huanhuan Feng Signed-off-by: Linus Walleij Signed-off-by: Sasha Levin commit b4aad4ba23e9901f51e42dd30e552f91c40d63c5 Author: Bin Liu Date: Tue Dec 5 08:45:30 2017 -0600 usb: musb: da8xx: fix babble condition handling [ Upstream commit bd3486ded7a0c313a6575343e6c2b21d14476645 ] When babble condition happens, the musb controller might automatically turns off VBUS. On DA8xx platform, the controller generates drvvbus interrupt for turning off VBUS along with the babble interrupt. In this case, we should handle the babble interrupt first and recover from the babble condition. This change ignores the drvvbus interrupt if babble interrupt is also generated at the same time, so the babble recovery routine works properly. Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Bin Liu Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 817c8706ce04e19209ffa05c68b038e3c56143ec Author: nixiaoming Date: Fri Sep 15 17:45:56 2017 +0800 tty fix oops when rmmod 8250 [ Upstream commit c79dde629d2027ca80329c62854a7635e623d527 ] After rmmod 8250.ko tty_kref_put starts kwork (release_one_tty) to release proc interface oops when accessing driver->driver_name in proc_tty_unregister_driver Use jprobe, found driver->driver_name point to 8250.ko static static struct uart_driver serial8250_reg .driver_name= serial, Use name in proc_dir_entry instead of driver->driver_name to fix oops test on linux 4.1.12: BUG: unable to handle kernel paging request at ffffffffa01979de IP: [] strchr+0x0/0x30 PGD 1a0d067 PUD 1a0e063 PMD 851c1f067 PTE 0 Oops: 0000 [#1] PREEMPT SMP Modules linked in: ... ... [last unloaded: 8250] CPU: 7 PID: 116 Comm: kworker/7:1 Tainted: G O 4.1.12 #1 Hardware name: Insyde RiverForest/Type2 - Board Product Name1, BIOS NE5KV904 12/21/2015 Workqueue: events release_one_tty task: ffff88085b684960 ti: ffff880852884000 task.ti: ffff880852884000 RIP: 0010:[] [] strchr+0x0/0x30 RSP: 0018:ffff880852887c90 EFLAGS: 00010282 RAX: ffffffff81a5eca0 RBX: ffffffffa01979de RCX: 0000000000000004 RDX: ffff880852887d10 RSI: 000000000000002f RDI: ffffffffa01979de RBP: ffff880852887cd8 R08: 0000000000000000 R09: ffff88085f5d94d0 R10: 0000000000000195 R11: 0000000000000000 R12: ffffffffa01979de R13: ffff880852887d00 R14: ffffffffa01979de R15: ffff88085f02e840 FS: 0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa01979de CR3: 0000000001a0c000 CR4: 00000000001406e0 Stack: ffffffff812349b1 ffff880852887cb8 ffff880852887d10 ffff88085f5cd6c2 ffff880852800a80 ffffffffa01979de ffff880852800a84 0000000000000010 ffff88085bb28bd8 ffff880852887d38 ffffffff812354f0 ffff880852887d08 Call Trace: [] ? __xlate_proc_name+0x71/0xd0 [] remove_proc_entry+0x40/0x180 [] ? _raw_spin_lock_irqsave+0x41/0x60 [] ? destruct_tty_driver+0x60/0xe0 [] proc_tty_unregister_driver+0x28/0x40 [] destruct_tty_driver+0x88/0xe0 [] tty_driver_kref_put+0x1d/0x20 [] release_one_tty+0x5a/0xd0 [] process_one_work+0x139/0x420 [] worker_thread+0x121/0x450 [] ? process_scheduled_works+0x40/0x40 [] kthread+0xec/0x110 [] ? tg_rt_schedulable+0x210/0x220 [] ? kthread_freezable_should_stop+0x80/0x80 [] ret_from_fork+0x42/0x70 [] ? kthread_freezable_should_stop+0x80/0x80 Signed-off-by: nixiaoming Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit eae9baa0fb770a801472d0b19604c30c5ab64451 Author: Michael Ellerman Date: Mon Oct 9 21:52:44 2017 +1100 powerpc/perf/hv-24x7: Fix incorrect comparison in memord [ Upstream commit 05c14c03138532a3cb2aa29c2960445c8753343b ] In the hv-24x7 code there is a function memord() which tries to implement a sort function return -1, 0, 1. However one of the conditions is incorrect, such that it can never be true, because we will have already returned. I don't believe there is a bug in practice though, because the comparisons are an optimisation prior to calling memcmp(). Fix it by swapping the second comparision, so it can be true. Reported-by: David Binderman Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit e9d85866c84c8ea47b04185a8343823962b1fc6b Author: Alex Williamson Date: Wed Oct 11 15:35:56 2017 -0600 PCI: Detach driver before procfs & sysfs teardown on device remove [ Upstream commit 16b6c8bb687cc3bec914de09061fcb8411951fda ] When removing a device, for example a VF being removed due to SR-IOV teardown, a "soft" hot-unplug via 'echo 1 > remove' in sysfs, or an actual hot-unplug, we first remove the procfs and sysfs attributes for the device before attempting to release the device from any driver bound to it. Unbinding the driver from the device can take time. The device might need to write out data or it might be actively in use. If it's in use by userspace through a vfio driver, the unbind might block until the user releases the device. This leads to a potentially non-trivial amount of time where the device exists, but we've torn down the interfaces that userspace uses to examine devices, for instance lspci might generate this sort of error: pcilib: Cannot open /sys/bus/pci/devices/0000:01:0a.3/config lspci: Unable to read the standard configuration space header of device 0000:01:0a.3 We don't seem to have any dependence on this teardown ordering in the kernel, so let's unbind the driver first, which is also more symmetric with the instantiation of the device in pci_bus_add_device(). Signed-off-by: Alex Williamson Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin commit dc8736a8624ecf22090ff23fa1532fbe5252f3f9 Author: Christoph Hellwig Date: Tue Oct 17 14:16:19 2017 -0700 xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real [ Upstream commit 5e422f5e4fd71d18bc6b851eeb3864477b3d842e ] There was one spot in xfs_bmap_add_extent_unwritten_real that didn't use the passed in new extent state but always converted to normal, leading to wrong behavior when converting from normal to unwritten. Only found by code inspection, it seems like this code path to move partial extent from written to unwritten while merging it with the next extent is rarely exercised. Signed-off-by: Christoph Hellwig Reviewed-by: Brian Foster Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong Signed-off-by: Sasha Levin commit c53b1162f90c24e3a308d0cd885edc9b0c141f60 Author: Brian Foster Date: Thu Oct 26 09:31:16 2017 -0700 xfs: fix log block underflow during recovery cycle verification [ Upstream commit 9f2a4505800607e537e9dd9dea4f55c4b0c30c7a ] It is possible for mkfs to format very small filesystems with too small of an internal log with respect to the various minimum size and block count requirements. If this occurs when the log happens to be smaller than the scan window used for cycle verification and the scan wraps the end of the log, the start_blk calculation in xlog_find_head() underflows and leads to an attempt to scan an invalid range of log blocks. This results in log recovery failure and a failed mount. Since there may be filesystems out in the wild with this kind of geometry, we cannot simply refuse to mount. Instead, cap the scan window for cycle verification to the size of the physical log. This ensures that the cycle verification proceeds as expected when the scan wraps the end of the log. Reported-by: Zorro Lang Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong Signed-off-by: Darrick J. Wong Signed-off-by: Sasha Levin commit 5a260ef67e55cc6dce5c39acecc2cbfb0ef3150d Author: Jiri Slaby Date: Wed Oct 25 15:57:55 2017 +0200 l2tp: cleanup l2tp_tunnel_delete calls [ Upstream commit 4dc12ffeaeac939097a3f55c881d3dc3523dff0c ] l2tp_tunnel_delete does not return anything since commit 62b982eeb458 ("l2tp: fix race condition in l2tp_tunnel_delete"). But call sites of l2tp_tunnel_delete still do casts to void to avoid unused return value warnings. Kill these now useless casts. Signed-off-by: Jiri Slaby Cc: Sabrina Dubroca Cc: Guillaume Nault Cc: David S. Miller Cc: netdev@vger.kernel.org Acked-by: Guillaume Nault Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit d2573c551583421b21a07116ff847ca9e27cd256 Author: tang.junhui Date: Mon Oct 30 14:46:34 2017 -0700 bcache: fix wrong cache_misses statistics [ Upstream commit c157313791a999646901b3e3c6888514ebc36d62 ] Currently, Cache missed IOs are identified by s->cache_miss, but actually, there are many situations that missed IOs are not assigned a value for s->cache_miss in cached_dev_cache_miss(), for example, a bypassed IO (s->iop.bypass = 1), or the cache_bio allocate failed. In these situations, it will go to out_put or out_submit, and s->cache_miss is null, which leads bch_mark_cache_accounting() to treat this IO as a hit IO. [ML: applied by 3-way merge] Signed-off-by: tang.junhui Reviewed-by: Michael Lyle Reviewed-by: Coly Li Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit a7a5473af8b85c8604b5ae5bbbd3e3c077584436 Author: Liang Chen Date: Mon Oct 30 14:46:35 2017 -0700 bcache: explicitly destroy mutex while exiting [ Upstream commit 330a4db89d39a6b43f36da16824eaa7a7509d34d ] mutex_destroy does nothing most of time, but it's better to call it to make the code future proof and it also has some meaning for like mutex debug. As Coly pointed out in a previous review, bcache_exit() may not be able to handle all the references properly if userspace registers cache and backing devices right before bch_debug_init runs and bch_debug_init failes later. So not exposing userspace interface until everything is ready to avoid that issue. Signed-off-by: Liang Chen Reviewed-by: Michael Lyle Reviewed-by: Coly Li Reviewed-by: Eric Wheeler Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit b45f3bb8f1a532b92922ff15a41435679eafdae8 Author: Bob Peterson Date: Wed Sep 20 08:30:04 2017 -0500 GFS2: Take inode off order_write list when setting jdata flag [ Upstream commit cc555b09d8c3817aeebda43a14ab67049a5653f7 ] This patch fixes a deadlock caused when the jdata flag is set for inodes that are already on the ordered write list. Since it is on the ordered write list, log_flush calls gfs2_ordered_write which calls filemap_fdatawrite. But since the inode had the jdata flag set, that calls gfs2_jdata_writepages, which tries to start a new transaction. A new transaction cannot be started because it tries to acquire the log_flush rwsem which is already locked by the log flush operation. The bottom line is: We cannot switch an inode from ordered to jdata until we eliminate any ordered data pages (via log flush) or any log_flush operation afterward will create the circular dependency above. So we need to flush the log before setting the diskflags to switch the file mode, then we need to remove the inode from the ordered writes list. Before this patch, the log flush was done for jdata->ordered, but that's wrong. If we're going from jdata to ordered, we don't need to call gfs2_log_flush because the call to filemap_fdatawrite will do it for us: filemap_fdatawrite() -> __filemap_fdatawrite_range() __filemap_fdatawrite_range() -> do_writepages() do_writepages() -> gfs2_jdata_writepages() gfs2_jdata_writepages() -> gfs2_log_flush() This patch modifies function do_gfs2_set_flags so that if a file has its jdata flag set, and it's already on the ordered write list, the log will be flushed and it will be removed from the list before setting the flag. Signed-off-by: Bob Peterson Signed-off-by: Andreas Gruenbacher Acked-by: Abhijith Das Signed-off-by: Sasha Levin commit bfd3a082492c4bf283e5c783205f67760c30daed Author: Daniel Lezcano Date: Thu Oct 19 19:05:58 2017 +0200 thermal/drivers/step_wise: Fix temperature regulation misbehavior [ Upstream commit 07209fcf33542c1ff1e29df2dbdf8f29cdaacb10 ] There is a particular situation when the cooling device is cpufreq and the heat dissipation is not efficient enough where the temperature increases little by little until reaching the critical threshold and leading to a SoC reset. The behavior is reproducible on a hikey6220 with bad heat dissipation (eg. stacked with other boards). Running a simple C program doing while(1); for each CPU of the SoC makes the temperature to reach the passive regulation trip point and ends up to the maximum allowed temperature followed by a reset. This issue has been also reported by running the libhugetlbfs test suite. What is observed is a ping pong between two cpu frequencies, 1.2GHz and 900MHz while the temperature continues to grow. It appears the step wise governor calls get_target_state() the first time with the throttle set to true and the trend to 'raising'. The code selects logically the next state, so the cpu frequency decreases from 1.2GHz to 900MHz, so far so good. The temperature decreases immediately but still stays greater than the trip point, then get_target_state() is called again, this time with the throttle set to true *and* the trend to 'dropping'. From there the algorithm assumes we have to step down the state and the cpu frequency jumps back to 1.2GHz. But the temperature is still higher than the trip point, so get_target_state() is called with throttle=1 and trend='raising' again, we jump to 900MHz, then get_target_state() is called with throttle=1 and trend='dropping', we jump to 1.2GHz, etc ... but the temperature does not stabilizes and continues to increase. [ 237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 237.922690] thermal cooling_device0: cur_state=0 [ 237.922701] thermal cooling_device0: old_target=0, target=1 [ 238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1 [ 238.026694] thermal cooling_device0: cur_state=1 [ 238.026707] thermal cooling_device0: old_target=1, target=0 [ 238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 238.134679] thermal cooling_device0: cur_state=0 [ 238.134690] thermal cooling_device0: old_target=0, target=1 In this situation the temperature continues to increase while the trend is oscillating between 'dropping' and 'raising'. We need to keep the current state untouched if the throttle is set, so the temperature can decrease or a higher state could be selected, thus preventing this oscillation. Keeping the next_target untouched when 'throttle' is true at 'dropping' time fixes the issue. The following traces show the governor does not change the next state if trend==2 (dropping) and throttle==1. [ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 2306.128021] thermal cooling_device0: cur_state=0 [ 2306.128031] thermal cooling_device0: old_target=0, target=1 [ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1 [ 2306.232030] thermal cooling_device0: cur_state=1 [ 2306.232042] thermal cooling_device0: old_target=1, target=1 [ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1 [ 2306.336021] thermal cooling_device0: cur_state=1 [ 2306.336034] thermal cooling_device0: old_target=1, target=1 [ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0 [ 2306.440022] thermal cooling_device0: cur_state=1 [ 2306.440034] thermal cooling_device0: old_target=1, target=0 [ ... ] After a while, if the temperature continues to increase, the next state becomes 2 which is 720MHz on the hikey. That results in the temperature stabilizing around the trip point. [ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0 [ 2455.832019] thermal cooling_device0: cur_state=1 [ 2455.832032] thermal cooling_device0: old_target=1, target=1 [ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0 [ 2455.936027] thermal cooling_device0: cur_state=1 [ 2455.936040] thermal cooling_device0: old_target=1, target=1 [ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1 [ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0 [ 2456.044023] thermal cooling_device0: cur_state=1 [ 2456.044036] thermal cooling_device0: old_target=1, target=1 [ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1 [ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1 [ 2456.148042] thermal cooling_device0: cur_state=1 [ 2456.148055] thermal cooling_device0: old_target=1, target=2 [ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1 [ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0 [ 2456.252058] thermal cooling_device0: cur_state=2 [ 2456.252075] thermal cooling_device0: old_target=2, target=1 IOW, this change is needed to keep the state for a cooling device if the temperature trend is oscillating while the temperature increases slightly. Without this change, the situation above leads to a catastrophic crash by a hardware reset on hikey. This issue has been reported to happen on an OMAP dra7xx also. Signed-off-by: Daniel Lezcano Cc: Keerthy Cc: John Stultz Cc: Leo Yan Tested-by: Keerthy Reviewed-by: Keerthy Signed-off-by: Eduardo Valentin Signed-off-by: Sasha Levin commit 2b3e37c146695fa8b0f7747e75b64ef807a36f48 Author: Gao Feng Date: Tue Oct 31 18:25:37 2017 +0800 ppp: Destroy the mutex when cleanup [ Upstream commit f02b2320b27c16b644691267ee3b5c110846f49e ] The mutex_destroy only makes sense when enable DEBUG_MUTEX. For the good readbility, it's better to invoke it in exit func when the init func invokes mutex_init. Signed-off-by: Gao Feng Acked-by: Guillaume Nault Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 350e8cd3ccd99691d51da10a79d6954740fa9b09 Author: Michał Mirosław Date: Tue Sep 19 04:48:10 2017 +0200 clk: tegra: Fix cclk_lp divisor register [ Upstream commit 54eff2264d3e9fd7e3987de1d7eba1d3581c631e ] According to comments in code and common sense, cclk_lp uses its own divisor, not cclk_g's. Fixes: b08e8c0ecc42 ("clk: tegra: add clock support for Tegra30") Signed-off-by: Michał Mirosław Acked-By: Peter De Schrijver Signed-off-by: Thierry Reding Signed-off-by: Sasha Levin commit a5579ec0173222b49e104e17aab63cd664d14024 Author: Jan Kara Date: Fri Nov 3 12:21:21 2017 +0100 mm: Handle 0 flags in _calc_vm_trans() macro [ Upstream commit 592e254502041f953e84d091eae2c68cba04c10b ] _calc_vm_trans() does not handle the situation when some of the passed flags are 0 (which can happen if these VM flags do not make sense for the architecture). Improve the _calc_vm_trans() macro to return 0 in such situation. Since all passed flags are constant, this does not add any runtime overhead. Signed-off-by: Jan Kara Signed-off-by: Dan Williams Signed-off-by: Sasha Levin commit 7103f7e1bcc4f3387c371d7ad3f102597e84da0c Author: Suzuki K Poulose Date: Fri Nov 3 11:45:18 2017 +0000 arm-ccn: perf: Prevent module unload while PMU is in use [ Upstream commit c7f5828bf77dcbd61d51f4736c1d5aa35663fbb4 ] When the PMU driver is built as a module, the perf expects the pmu->module to be valid, so that the driver is prevented from being unloaded while it is in use. Fix the CCN pmu driver to fill in this field. Fixes: a33b0daab73a0 ("bus: ARM CCN PMU driver") Cc: Pawel Moll Cc: Will Deacon Acked-by: Mark Rutland Signed-off-by: Suzuki K Poulose Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit abc0400cd8accd1ace0d69b7c0994014aaa9bb7b Author: Jiang Yi Date: Fri Aug 11 11:29:44 2017 +0800 target/file: Do not return error for UNMAP if length is zero [ Upstream commit 594e25e73440863981032d76c9b1e33409ceff6e ] The function fd_execute_unmap() in target_core_file.c calles ret = file->f_op->fallocate(file, mode, pos, len); Some filesystems implement fallocate() to return error if length is zero (e.g. btrfs) but according to SCSI Block Commands spec UNMAP should return success for zero length. Signed-off-by: Jiang Yi Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 787542f14a548dd43113a0556400c59fb990f058 Author: tangwenji Date: Thu Aug 24 19:59:37 2017 +0800 target:fix condition return in core_pr_dump_initiator_port() [ Upstream commit 24528f089d0a444070aa4f715ace537e8d6bf168 ] When is pr_reg->isid_present_at_reg is false,this function should return. This fixes a regression originally introduced by: commit d2843c173ee53cf4c12e7dfedc069a5bc76f0ac5 Author: Andy Grover Date: Thu May 16 10:40:55 2013 -0700 target: Alter core_pr_dump_initiator_port for ease of use Signed-off-by: tangwenji Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 1d3f887ec1417723796770fee371ddbd2f882c9a Author: tangwenji Date: Fri Sep 15 16:03:13 2017 +0800 iscsi-target: fix memory leak in lio_target_tiqn_addtpg() [ Upstream commit 12d5a43b2dffb6cd28062b4e19024f7982393288 ] tpg must free when call core_tpg_register() return fail Signed-off-by: tangwenji Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 36376ec8e77b9e537b29dbb8a1497968c4b25e95 Author: Bart Van Assche Date: Tue Oct 31 11:03:17 2017 -0700 target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd() [ Upstream commit cfe2b621bb18d86e93271febf8c6e37622da2d14 ] Avoid that cmd->se_cmd.se_tfo is read after a command has already been freed. Signed-off-by: Bart Van Assche Cc: Christoph Hellwig Cc: Mike Christie Reviewed-by: Hannes Reinecke Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit ea51d62d96c1ff9c39614fb855aca7d2fc4c68fa Author: Christophe Leroy Date: Wed Oct 18 11:16:47 2017 +0200 powerpc/ipic: Fix status get and status clear [ Upstream commit 6b148a7ce72a7f87c81cbcde48af014abc0516a9 ] IPIC Status is provided by register IPIC_SERSR and not by IPIC_SERMR which is the mask register. Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit d807bd894ab8ee75f166a2929e1467fb91624535 Author: William A. Kennington III Date: Fri Sep 22 16:58:00 2017 -0700 powerpc/opal: Fix EBUSY bug in acquiring tokens [ Upstream commit 71e24d7731a2903b1ae2bba2b2971c654d9c2aa6 ] The current code checks the completion map to look for the first token that is complete. In some cases, a completion can come in but the token can still be on lease to the caller processing the completion. If this completed but unreleased token is the first token found in the bitmap by another tasks trying to acquire a token, then the __test_and_set_bit call will fail since the token will still be on lease. The acquisition will then fail with an EBUSY. This patch reorganizes the acquisition code to look at the opal_async_token_map for an unleased token. If the token has no lease it must have no outstanding completions so we should never see an EBUSY, unless we have leased out too many tokens. Since opal_async_get_token_inrerruptible is protected by a semaphore, we will practically never see EBUSY anymore. Fixes: 8d7248232208 ("powerpc/powernv: Infrastructure to support OPAL async completion") Signed-off-by: William A. Kennington III Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit 703f625378288d753bdd8a22822a5254ff1a0b3f Author: Shriya Date: Fri Oct 13 10:06:41 2017 +0530 powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo [ Upstream commit cd77b5ce208c153260ed7882d8910f2395bfaabd ] The call to /proc/cpuinfo in turn calls cpufreq_quick_get() which returns the last frequency requested by the kernel, but may not reflect the actual frequency the processor is running at. This patch makes a call to cpufreq_get() instead which returns the current frequency reported by the hardware. Fixes: fb5153d05a7d ("powerpc: powernv: Implement ppc_md.get_proc_freq()") Signed-off-by: Shriya Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit fc5d20b5cc21cf82d196bbf173a29cdde4fb9236 Author: Qiang Date: Thu Sep 28 11:54:34 2017 +0800 PCI/PME: Handle invalid data when reading Root Status [ Upstream commit 3ad3f8ce50914288731a3018b27ee44ab803e170 ] PCIe PME and native hotplug share the same interrupt number, so hotplug interrupts are also processed by PME. In some cases, e.g., a Link Down interrupt, a device may be present but unreachable, so when we try to read its Root Status register, the read fails and we get all ones data (0xffffffff). Previously, we interpreted that data as PCI_EXP_RTSTA_PME being set, i.e., "some device has asserted PME," so we scheduled pcie_pme_work_fn(). This caused an infinite loop because pcie_pme_work_fn() tried to handle PME requests until PCI_EXP_RTSTA_PME is cleared, but with the link down, PCI_EXP_RTSTA_PME can't be cleared. Check for the invalid 0xffffffff data everywhere we read the Root Status register. 1469d17dd341 ("PCI: pciehp: Handle invalid data when reading from non-existent devices") added similar checks in the hotplug driver. Signed-off-by: Qiang Zheng [bhelgaas: changelog, also check in pcie_pme_work_fn(), use "~0" to follow other similar checks] Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin commit 7a10bd329fd5a2510cadd9fd6cc54b4436603257 Author: Christophe JAILLET Date: Thu Nov 9 18:09:28 2017 +0100 video: fbdev: au1200fb: Return an error code if a memory allocation fails [ Upstream commit 8cae353e6b01ac3f18097f631cdbceb5ff28c7f3 ] 'ret' is known to be 0 at this point. In case of memory allocation error in 'framebuffer_alloc()', return -ENOMEM instead. Signed-off-by: Christophe JAILLET Cc: Tejun Heo Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Sasha Levin commit f1530123dba028c4a656bff29c0706a94cf95c9d Author: Christophe JAILLET Date: Thu Nov 9 18:09:28 2017 +0100 video: fbdev: au1200fb: Release some resources if a memory allocation fails [ Upstream commit 451f130602619a17c8883dd0b71b11624faffd51 ] We should go through the error handling code instead of returning -ENOMEM directly. Signed-off-by: Christophe JAILLET Cc: Tejun Heo Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Sasha Levin commit 46c0f37ae53cede02cfde76587a86163c5158708 Author: Ladislav Michl Date: Thu Nov 9 18:09:30 2017 +0100 video: udlfb: Fix read EDID timeout [ Upstream commit c98769475575c8a585f5b3952f4b5f90266f699b ] While usb_control_msg function expects timeout in miliseconds, a value of HZ is used. Replace it with USB_CTRL_GET_TIMEOUT and also fix error message which looks like: udlfb: Read EDID byte 78 failed err ffffff92 as error is either negative errno or number of bytes transferred use %d format specifier. Returned EDID is in second byte, so return error when less than two bytes are received. Fixes: 18dffdf8913a ("staging: udlfb: enhance EDID and mode handling support") Signed-off-by: Ladislav Michl Cc: Bernie Thompson Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Sasha Levin commit c94e3801ad2c95cce2e17c32a58d3550d621d3be Author: Geert Uytterhoeven Date: Thu Nov 9 18:09:33 2017 +0100 fbdev: controlfb: Add missing modes to fix out of bounds access [ Upstream commit ac831a379d34109451b3c41a44a20ee10ecb615f ] Dan's static analysis says: drivers/video/fbdev/controlfb.c:560 control_setup() error: buffer overflow 'control_mac_modes' 20 <= 21 Indeed, control_mac_modes[] has only 20 elements, while VMODE_MAX is 22, which may lead to an out of bounds read when parsing vmode commandline options. The bug was introduced in v2.4.5.6, when 2 new modes were added to macmodes.h, but control_mac_modes[] wasn't updated: https://kernel.opensuse.org/cgit/kernel/diff/include/video/macmodes.h?h=v2.5.2&id=29f279c764808560eaceb88fef36cbc35c529aad Augment control_mac_modes[] with the two new video modes to fix this. Reported-by: Dan Carpenter Signed-off-by: Geert Uytterhoeven Cc: Dan Carpenter Cc: Benjamin Herrenschmidt Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Sasha Levin commit 30d5fde364f7137fe5f95bb07200ae0eb265d81b Author: Mike Christie Date: Wed Mar 1 23:13:26 2017 -0600 target: Use system workqueue for ALUA transitions [ Upstream commit 207ee84133c00a8a2a5bdec94df4a5b37d78881c ] If tcmu-runner is processing a STPG and needs to change the kernel's ALUA state then we cannot use the same work queue for task management requests and ALUA transitions, because we could deadlock. The problem occurs when a STPG times out before tcmu-runner is able to call into target_tg_pt_gp_alua_access_state_store-> core_alua_do_port_transition -> core_alua_do_transition_tg_pt -> queue_work. In this case, the tmr is on the work queue waiting for the STPG to complete, but the STPG transition is now queued behind the waiting tmr. Note: This bug will also be fixed by this patch: http://www.spinics.net/lists/target-devel/msg14560.html which switches the tmr code to use the system workqueues. For both, I am not sure if we need a dedicated workqueue since it is not a performance path and I do not think we need WQ_MEM_RECLAIM to make forward progress to free up memory like the block layer does. Signed-off-by: Mike Christie Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 768b1d1dbb46f8c7525c30e8cdb4cd1f38d98e99 Author: Zygo Blaxell Date: Fri Mar 10 16:45:44 2017 -0500 btrfs: add missing memset while reading compressed inline extents [ Upstream commit e1699d2d7bf6e6cce3e1baff19f9dd4595a58664 ] This is a story about 4 distinct (and very old) btrfs bugs. Commit c8b978188c ("Btrfs: Add zlib compression support") added three data corruption bugs for inline extents (bugs #1-3). Commit 93c82d5750 ("Btrfs: zero page past end of inline file items") fixed bug #1: uncompressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read. The fix was to add a memset in btrfs_get_extent to zero out the hole. Commit 166ae5a418 ("btrfs: fix inline compressed read err corruption") fixed bug #2: compressed inline extents which contained non-zero bytes might be replaced with zero bytes in some cases. This patch removed an unhelpful memset from uncompress_inline, but the case where memset is required was missed. There is also a memset in the decompression code, but this only covers decompressed data that is shorter than the ram_bytes from the extent ref record. This memset doesn't cover the region between the end of the decompressed data and the end of the page. It has also moved around a few times over the years, so there's no single patch to refer to. This patch fixes bug #3: compressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read (i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/). The fix is the same: zero out the hole in the compressed case too, by putting a memset back in uncompress_inline, but this time with correct parameters. The last and oldest bug, bug #0, is the cause of the offending inline extent/hole/extent pattern. Bug #0 is a subtle and mostly-harmless quirk of behavior somewhere in the btrfs write code. In a few special cases, an inline extent and hole are allowed to persist where they normally would be combined with later extents in the file. A fast reproducer for bug #0 is presented below. A few offending extents are also created in the wild during large rsync transfers with the -S flag. A Linux kernel build (git checkout; make allyesconfig; make -j8) will produce a handful of offending files as well. Once an offending file is created, it can present different content to userspace each time it is read. Bug #0 is at least 4 and possibly 8 years old. I verified every vX.Y kernel back to v3.5 has this behavior. There are fossil records of this bug's effects in commits all the way back to v2.6.32. I have no reason to believe bug #0 wasn't present at the beginning of btrfs compression support in v2.6.29, but I can't easily test kernels that old to be sure. It is not clear whether bug #0 is worth fixing. A fix would likely require injecting extra reads into currently write-only paths, and most of the exceptional cases caused by bug #0 are already handled now. Whether we like them or not, bug #0's inline extents followed by holes are part of the btrfs de-facto disk format now, and we need to be able to read them without data corruption or an infoleak. So enough about bug #0, let's get back to bug #3 (this patch). An example of on-disk structure leading to data corruption found in the wild: item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160 inode generation 50 transid 50 size 47424 nbytes 49141 block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0 flags 0x0(none) item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20 inode ref index 3 namelen 10 name: DB_File.so item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362 inline extent data size 1341 ram 4085 compress(zlib) item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53 extent data disk byte 5367308288 nr 20480 extent data offset 0 nr 45056 ram 45056 extent compression(zlib) Different data appears in userspace during each read of the 11 bytes between 4085 and 4096. The extent in item 63 is not long enough to fill the first page of the file, so a memset is required to fill the space between item 63 (ending at 4085) and item 64 (beginning at 4096) with zero. Here is a reproducer from Liu Bo, which demonstrates another method of creating the same inline extent and hole pattern: Using 'page_poison=on' kernel command line (or enable CONFIG_PAGE_POISONING) run the following: # touch foo # chattr +c foo # xfs_io -f -c "pwrite -W 0 1000" foo # xfs_io -f -c "falloc 4 8188" foo # od -x foo # echo 3 >/proc/sys/vm/drop_caches # od -x foo This produce the following on my box: Correct output: file contains 1000 data bytes followed by zeros: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000 0001760 0000 0000 0000 0000 0000 0000 0000 0000 * 0020000 Actual output: the data after the first 1000 bytes will be different each run: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d 0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400 0002000 435f 0056 5f74 6164 7400 645f 0062 5f74 (...) Signed-off-by: Zygo Blaxell Reviewed-by: Liu Bo Reviewed-by: Chris Mason Signed-off-by: Chris Mason Signed-off-by: Sasha Levin commit 265926de6f07bcc02059f03bfc57dd1fda13b032 Author: Olga Kornievskaia Date: Wed Mar 8 14:39:15 2017 -0500 NFSv4.1 respect server's max size in CREATE_SESSION [ Upstream commit 033853325fe3bdc70819a8b97915bd3bca41d3af ] Currently client doesn't respect max sizes server returns in CREATE_SESSION. nfs4_session_set_rwsize() gets called and server->rsize, server->wsize are 0 so they never get set to the sizes returned by the server. Signed-off-by: Olga Kornievskaia Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit c7da20b8ca027a7554224814e1f2d098d1268e3b Author: Daniel Borkmann Date: Wed Mar 15 22:53:37 2017 +0100 perf symbols: Fix symbols__fixup_end heuristic for corner cases [ Upstream commit e7ede72a6d40cb3a30c087142d79381ca8a31dab ] The current symbols__fixup_end() heuristic for the last entry in the rb tree is suboptimal as it leads to not being able to recognize the symbol in the call graph in a couple of corner cases, for example: i) If the symbol has a start address (f.e. exposed via kallsyms) that is at a page boundary, then the roundup(curr->start, 4096) for the last entry will result in curr->start == curr->end with a symbol length of zero. ii) If the symbol has a start address that is shortly before a page boundary, then also here, curr->end - curr->start will just be very few bytes, where it's unrealistic that we could perform a match against. Instead, change the heuristic to roundup(curr->start, 4096) + 4096, so that we can catch such corner cases and have a better chance to find that specific symbol. It's still just best effort as the real end of the symbol is unknown to us (and could even be at a larger offset than the current range), but better than the current situation. Alexei reported that he recently run into case i) with a JITed eBPF program (these are all page aligned) as the last symbol which wasn't properly shown in the call graph (while other eBPF program symbols in the rb tree were displayed correctly). Since this is a generic issue, lets try to improve the heuristic a bit. Reported-and-Tested-by: Alexei Starovoitov Signed-off-by: Daniel Borkmann Fixes: 2e538c4a1847 ("perf tools: Improve kernel/modules symbol lookup") Link: http://lkml.kernel.org/r/bb5c80d27743be6f12afc68405f1956a330e1bc9.1489614365.git.daniel@iogearbox.net Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit 889408d364b74df8b175617885d762a74c690315 Author: Jack Morgenstein Date: Mon Mar 13 19:29:08 2017 +0200 net/mlx4_core: Avoid delays during VF driver device shutdown [ Upstream commit 4cbe4dac82e423ecc9a0ba46af24a860853259f4 ] Some Hypervisors detach VFs from VMs by instantly causing an FLR event to be generated for a VF. In the mlx4 case, this will cause that VF's comm channel to be disabled before the VM has an opportunity to invoke the VF device's "shutdown" method. For such Hypervisors, there is a race condition between the VF's shutdown method and its internal-error detection/reset thread. The internal-error detection/reset thread (which runs every 5 seconds) also detects a disabled comm channel. If the internal-error detection/reset flow wins the race, we still get delays (while that flow tries repeatedly to detect comm-channel recovery). The cited commit fixed the command timeout problem when the internal-error detection/reset flow loses the race. This commit avoids the unneeded delays when the internal-error detection/reset flow wins. Fixes: d585df1c5ccf ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown") Signed-off-by: Jack Morgenstein Reported-by: Simon Xiao Signed-off-by: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4f5f3108254093bad63632335bdc245e02eb4427 Author: David Howells Date: Thu Mar 16 16:27:48 2017 +0000 afs: Fix afs_kill_pages() [ Upstream commit 7286a35e893176169b09715096a4aca557e2ccd2 ] Fix afs_kill_pages() in two ways: (1) If a writeback has been partially flushed, then if we try and kill the pages it contains, some of them may no longer be undergoing writeback and end_page_writeback() will assert. Fix this by checking to see whether the page in question is actually undergoing writeback before ending that writeback. (2) The loop that scans for pages to kill doesn't increase the first page index, and so the loop may not terminate, but it will try to process the same pages over and over again. Fix this by increasing the first page index to one after the last page we processed. Signed-off-by: David Howells Signed-off-by: Sasha Levin commit b38a026e7efd5d260fdb52160d51cd1e29b8a476 Author: David Howells Date: Thu Mar 16 16:27:48 2017 +0000 afs: Fix page leak in afs_write_begin() [ Upstream commit 6d06b0d25209c80e99c1e89700f1e09694a3766b ] afs_write_begin() leaks a ref and a lock on a page if afs_fill_page() fails. Fix the leak by unlocking and releasing the page in the error path. Signed-off-by: David Howells Signed-off-by: Sasha Levin commit fe218174a8cd439eff908001a7ebcdfd6786a114 Author: Marc Dionne Date: Thu Mar 16 16:27:47 2017 +0000 afs: Populate and use client modification time [ Upstream commit ab94f5d0dd6fd82e7eeca5e7c8096eaea0a0261f ] The inode timestamps should be set from the client time in the status received from the server, rather than the server time which is meant for internal server use. Set AFS_SET_MTIME and populate the mtime for operations that take an input status, such as file/dir creation and StoreData. If an input time is not provided the server will set the vnode times based on the current server time. In a situation where the server has some skew with the client, this could lead to the client seeing a timestamp in the future for a file that it just created or wrote. Signed-off-by: Marc Dionne Signed-off-by: David Howells Signed-off-by: Sasha Levin commit 6057493b8d401c2586479fe1e19accb883398feb Author: David Howells Date: Thu Mar 16 16:27:47 2017 +0000 afs: Fix the maths in afs_fs_store_data() [ Upstream commit 146a1192783697810b63a1e41c4d59fc93387340 ] afs_fs_store_data() works out of the size of the write it's going to make, but it uses 32-bit unsigned subtraction in one place that gets automatically cast to loff_t. However, if to < offset, then the number goes negative, but as the result isn't signed, this doesn't get sign-extended to 64-bits when placed in a loff_t. Fix by casting the operands to loff_t. Signed-off-by: David Howells Signed-off-by: Sasha Levin commit 6002fff6d220c112935fa96a2d796528b71aac05 Author: Tina Ruchandani Date: Thu Mar 16 16:27:46 2017 +0000 afs: Prevent callback expiry timer overflow [ Upstream commit 56e714312e7dbd6bb83b2f78d3ec19a404c7649f ] get_seconds() returns real wall-clock seconds. On 32-bit systems this value will overflow in year 2038 and beyond. This patch changes afs_vnode record to use ktime_get_real_seconds() instead, for the fields cb_expires and cb_expires_at. Signed-off-by: Tina Ruchandani Signed-off-by: David Howells Signed-off-by: Sasha Levin commit 3e8f3e799ee57e69b4b1592e9f43005f7e474e4e Author: Tina Ruchandani Date: Thu Mar 16 16:27:46 2017 +0000 afs: Migrate vlocation fields to 64-bit [ Upstream commit 8a79790bf0b7da216627ffb85f52cfb4adbf1e4e ] get_seconds() returns real wall-clock seconds. On 32-bit systems this value will overflow in year 2038 and beyond. This patch changes afs's vlocation record to use ktime_get_real_seconds() instead, for the fields time_of_death and update_at. Signed-off-by: Tina Ruchandani Signed-off-by: David Howells Signed-off-by: Sasha Levin commit a4855bd91dc695528e5f2175cd3f62baa4cad812 Author: David Howells Date: Thu Mar 16 16:27:45 2017 +0000 afs: Flush outstanding writes when an fd is closed [ Upstream commit 58fed94dfb17e89556b5705f20f90e5b2971b6a1 ] Flush outstanding writes in afs when an fd is closed. This is what NFS and CIFS do. Reported-by: Marc Dionne Signed-off-by: David Howells Signed-off-by: Sasha Levin commit 878b514971dccc110ab715513b4349983904aeda Author: Marc Dionne Date: Thu Mar 16 16:27:44 2017 +0000 afs: Adjust mode bits processing [ Upstream commit 627f46943ff90bcc32ddeb675d881c043c6fa2ae ] Mode bits for an afs file should not be enforced in the usual way. For files, the absence of user bits can restrict file access with respect to what is granted by the server. These bits apply regardless of the owner or the current uid; the rest of the mode bits (group, other) are ignored. Signed-off-by: Marc Dionne Signed-off-by: David Howells Signed-off-by: Sasha Levin commit 70f9b817ed9032b8b598c7e60b5508d007bf6e37 Author: Marc Dionne Date: Thu Mar 16 16:27:43 2017 +0000 afs: Populate group ID from vnode status [ Upstream commit 6186f0788b31f44affceeedc7b48eb10faea120d ] The group was hard coded to GLOBAL_ROOT_GID; use the group ID that was received from the server. Signed-off-by: Marc Dionne Signed-off-by: David Howells Signed-off-by: Sasha Levin commit 416538a008726455cda2fd90eee5a03c875a3e37 Author: David Howells Date: Thu Mar 16 16:27:43 2017 +0000 afs: Fix missing put_page() [ Upstream commit 29c8bbbd6e21daa0997d1c3ee886b897ee7ad652 ] In afs_writepages_region(), inside the loop where we find dirty pages to deal with, one of the if-statements is missing a put_page(). Signed-off-by: David Howells Signed-off-by: Sasha Levin commit d70aca3a2e967822b307fa272ae38875f5028667 Author: Steven Rostedt (VMware) Date: Thu Mar 2 15:10:59 2017 +0100 sched/deadline: Use deadline instead of period when calculating overflow [ Upstream commit 2317d5f1c34913bac5971d93d69fb6c31bb74670 ] I was testing Daniel's changes with his test case, and tweaked it a little. Instead of having the runtime equal to the deadline, I increased the deadline ten fold. Daniel's test case had: attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */ attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */ attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */ To make it more interesting, I changed it to: attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */ attr.sched_deadline = 20 * 1000 * 1000; /* 20 ms */ attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */ The results were rather surprising. The behavior that Daniel's patch was fixing came back. The task started using much more than .1% of the CPU. More like 20%. Looking into this I found that it was due to the dl_entity_overflow() constantly returning true. That's because it uses the relative period against relative runtime vs the absolute deadline against absolute runtime. runtime / (deadline - t) > dl_runtime / dl_period There's even a comment mentioning this, and saying that when relative deadline equals relative period, that the equation is the same as using deadline instead of period. That comment is backwards! What we really want is: runtime / (deadline - t) > dl_runtime / dl_deadline We care about if the runtime can make its deadline, not its period. And then we can say "when the deadline equals the period, the equation is the same as using dl_period instead of dl_deadline". After correcting this, now when the task gets enqueued, it can throttle correctly, and Daniel's fix to the throttling of sleeping deadline tasks works even when the runtime and deadline are not the same. Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Daniel Bristot de Oliveira Cc: Juri Lelli Cc: Linus Torvalds Cc: Luca Abeni Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Romulo Silva de Oliveira Cc: Steven Rostedt Cc: Thomas Gleixner Cc: Tommaso Cucinotta Link: http://lkml.kernel.org/r/02135a27f1ae3fe5fd032568a5a2f370e190e8d7.1488392936.git.bristot@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit f67d6545f01b0307e55baf97bbdc21dec7ea1c33 Author: Don Brace Date: Fri Mar 10 14:35:17 2017 -0600 scsi: hpsa: limit outstanding rescans [ Upstream commit 87b9e6aa87d9411f1059aa245c0c79976bc557ac ] Avoid rescan storms. No need to queue another if one is pending. Reviewed-by: Scott Benesh Reviewed-by: Scott Teel Reviewed-by: Tomas Henzl Signed-off-by: Don Brace Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 50c09c63594385b7e66feae26235925eb70b5759 Author: Stafford Horne Date: Mon Mar 13 07:44:45 2017 +0900 openrisc: fix issue handling 8 byte get_user calls [ Upstream commit 154e67cd8e8f964809d0e75e44bb121b169c75b3 ] Was getting the following error with allmodconfig: ERROR: "__get_user_bad" [lib/test_user_copy.ko] undefined! This was simply a missing break statement, causing an unwanted fall through. Signed-off-by: Stafford Horne Signed-off-by: Sasha Levin commit 825e8f8cd88fef30039f6c97210e0651ae59df4a Author: Vlad Yasevich Date: Tue Mar 14 08:58:08 2017 -0400 net: Resend IGMP memberships upon peer notification. [ Upstream commit 37c343b4f4e70e9dc328ab04903c0ec8d154c1a4 ] When we notify peers of potential changes, it's also good to update IGMP memberships. For example, during VM migration, updating IGMP memberships will redirect existing multicast streams to the VM at the new location. Signed-off-by: Vladislav Yasevich Acked-by: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a7eaeaf53fdd70df40d295e752166faf702d80f4 Author: Matthias Kaehlcke Date: Mon Mar 13 14:30:29 2017 -0700 dmaengine: Fix array index out of bounds warning in __get_unmap_pool() [ Upstream commit 23f963e91fd81f44f6b316b1c24db563354c6be8 ] This fixes the following warning when building with clang and CONFIG_DMA_ENGINE_RAID=n : drivers/dma/dmaengine.c:1102:11: error: array index 2 is past the end of the array (which contains 1 element) [-Werror,-Warray-bounds] return &unmap_pool[2]; ^ ~ drivers/dma/dmaengine.c:1083:1: note: array 'unmap_pool' declared here static struct dmaengine_unmap_pool unmap_pool[] = { ^ drivers/dma/dmaengine.c:1104:11: error: array index 3 is past the end of the array (which contains 1 element) [-Werror,-Warray-bounds] return &unmap_pool[3]; ^ ~ drivers/dma/dmaengine.c:1083:1: note: array 'unmap_pool' declared here static struct dmaengine_unmap_pool unmap_pool[] = { Signed-off-by: Matthias Kaehlcke Reviewed-by: Dan Williams Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 978a56a324ad7e5e52ccbaa5bff2ab76ca4dcb4f Author: Johan Hovold Date: Mon Mar 13 13:42:03 2017 +0100 net: wimax/i2400m: fix NULL-deref at probe [ Upstream commit 6e526fdff7be4f13b24f929a04c0e9ae6761291e ] Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. The endpoints are specifically dereferenced in the i2400m_bootrom_init path during probe (e.g. in i2400mu_tx_bulk_out). Fixes: f398e4240fce ("i2400m/USB: probe/disconnect, dev init/shutdown and reset backends") Cc: Inaky Perez-Gonzalez Signed-off-by: Johan Hovold Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6f7feb80d24c956b3f3c731a7dec11b6650dae92 Author: Dmitry Torokhov Date: Tue Feb 28 17:14:41 2017 -0800 Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list [ Upstream commit a4c2a13129f7c5bcf81704c06851601593303fd5 ] TUXEDO BU1406 does not implement active multiplexing mode properly, and takes around 550 ms in i8042_set_mux_mode(). Given that the device does not have external AUX port, there is no downside in disabling the MUX mode. Reported-by: Paul Menzel Suggested-by: Vojtech Pavlik Reviewed-by: Marcos Paulo de Souza Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin commit 7c6d4318dce23369a144d5ebe9669395ac705e15 Author: NeilBrown Date: Fri Mar 10 11:36:39 2017 +1100 NFSD: fix nfsd_reset_versions for NFSv4. [ Upstream commit 800a938f0bf9130c8256116649c0cc5806bfb2fd ] If you write "-2 -3 -4" to the "versions" file, it will notice that no versions are enabled, and nfsd_reset_versions() is called. This enables all major versions, not no minor versions. So we lose the invariant that NFSv4 is only advertised when at least one minor is enabled. Fix the code to explicitly enable minor versions for v4, change it to use nfsd_vers() to test and set, and simplify the code. Signed-off-by: NeilBrown Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin commit a24c04fb2b307be0e0bc4a6e2e12ecacf8e90ed1 Author: NeilBrown Date: Fri Mar 10 11:36:39 2017 +1100 NFSD: fix nfsd_minorversion(.., NFSD_AVAIL) [ Upstream commit 928c6fb3a9bfd6c5b287aa3465226add551c13c0 ] Current code will return 1 if the version is supported, and -1 if it isn't. This is confusing and inconsistent with the one place where this is used. So change to return 1 if it is supported, and zero if not. i.e. an error is never returned. Signed-off-by: NeilBrown Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin commit d3ecf7891a2b3dce47a961481e77b3f41c468290 Author: Doug Berger Date: Thu Mar 9 16:58:48 2017 -0800 net: bcmgenet: Power up the internal PHY before probing the MII [ Upstream commit 6be371b053dc86f11465cc1abce2e99bda0a0574 ] When using the internal PHY it must be powered up when the MII is probed or the PHY will not be detected. Since the PHY is powered up at reset this has not been a problem. However, when the kernel is restarted with kexec the PHY will likely be powered down when the kernel starts so it will not be detected and the Ethernet link will not be established. This commit explicitly powers up the internal PHY when the GENET driver is probed to correct this behavior. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 3c84a327df800c2eb5040b9198d865c9fbd333c0 Author: Doug Berger Date: Thu Mar 9 16:58:45 2017 -0800 net: bcmgenet: reserved phy revisions must be checked first [ Upstream commit eca4bad73409aedc6ff22f823c18b67a4f08c851 ] The reserved gphy_rev value of 0x01ff must be tested before the old or new scheme for GPHY major versioning are tested, otherwise it will be treated as 0xff00 according to the old scheme. Fixes: b04a2f5b9ff5 ("net: bcmgenet: add support for new GENET PHY revision scheme") Signed-off-by: Doug Berger Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0e4db24608573ece6ea98914960c71f33c8288d7 Author: Doug Berger Date: Thu Mar 9 16:58:44 2017 -0800 net: bcmgenet: correct MIB access of UniMAC RUNT counters [ Upstream commit 1ad3d225e5a40ca6c586989b4baaca710544c15a ] The gap between the Tx status counters and the Rx RUNT counters is now being added to allow correct reporting of the registers. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0594bbed16ac83b0d7626cc14c11eacb5c87b590 Author: Doug Berger Date: Thu Mar 9 16:58:43 2017 -0800 net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values [ Upstream commit ffff71328a3c321f7c14cc1edd33577717037744 ] The location of the RBUF overflow and error counters has moved between different version of the GENET MAC. This commit corrects the driver to read from the correct locations depending on the version of the GENET MAC. Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file") Signed-off-by: Doug Berger Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b8fcc2ddaeffd3ed32437c7a52d9f6ef189450a7 Author: Alexander Potapenko Date: Wed Mar 8 18:08:16 2017 +0100 net: initialize msg.msg_flags in recvfrom [ Upstream commit 9f138fa609c47403374a862a08a41394be53d461 ] KMSAN reports a use of uninitialized memory in put_cmsg() because msg.msg_flags in recvfrom haven't been initialized properly. The flag values don't affect the result on this path, but it's still a good idea to initialize them explicitly. Signed-off-by: Alexander Potapenko Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 23d1aa2220b016890938281138beafe7b9175050 Author: Guoqing Jiang Date: Fri Feb 24 11:15:12 2017 +0800 md-cluster: free md_cluster_info if node leave cluster [ Upstream commit 9c8043f337f14d1743006dfc59c03e80a42e3884 ] To avoid memory leak, we need to free the cinfo which is allocated when node join cluster. Reviewed-by: NeilBrown Signed-off-by: Guoqing Jiang Signed-off-by: Shaohua Li Signed-off-by: Sasha Levin commit 2e9bfadeec14b8bc1b2c0486bc24cb868ca81a99 Author: Javier Martinez Canillas Date: Wed Feb 22 15:23:22 2017 -0300 usb: phy: isp1301: Add OF device ID table [ Upstream commit fd567653bdb908009b650f079bfd4b63169e2ac4 ] The driver doesn't have a struct of_device_id table but supported devices are registered via Device Trees. This is working on the assumption that a I2C device registered via OF will always match a legacy I2C device ID and that the MODALIAS reported will always be of the form i2c:. But this could change in the future so the correct approach is to have an OF device ID table if the devices are registered via OF. Signed-off-by: Javier Martinez Canillas Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 8445dd361d31cebdb8ecaaf356258bf27ebe4cce Author: Ilan peer Date: Mon Dec 26 18:17:36 2016 +0200 mac80211: Fix addition of mesh configuration element [ Upstream commit 57629915d568c522ac1422df7bba4bee5b5c7a7c ] The code was setting the capabilities byte to zero, after it was already properly set previously. Fix it. The bug was found while debugging hwsim mesh tests failures that happened since the commit mentioned below. Fixes: 76f43b4c0a93 ("mac80211: Remove invalid flag operations in mesh TSF synchronization") Signed-off-by: Ilan Peer Reviewed-by: Masashi Honma Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 27bc8f4750a7e61fdbeaf654e6194cada59df75b Author: Chandan Rajendra Date: Mon Dec 11 15:00:57 2017 -0500 ext4: fix crash when a directory's i_size is too small [ Upstream commit 9d5afec6b8bd46d6ed821aa1579634437f58ef1f ] On a ppc64 machine, when mounting a fuzzed ext2 image (generated by fsfuzzer) the following call trace is seen, VFS: brelse: Trying to free free buffer WARNING: CPU: 1 PID: 6913 at /root/repos/linux/fs/buffer.c:1165 .__brelse.part.6+0x24/0x40 .__brelse.part.6+0x20/0x40 (unreliable) .ext4_find_entry+0x384/0x4f0 .ext4_lookup+0x84/0x250 .lookup_slow+0xdc/0x230 .walk_component+0x268/0x400 .path_lookupat+0xec/0x2d0 .filename_lookup+0x9c/0x1d0 .vfs_statx+0x98/0x140 .SyS_newfstatat+0x48/0x80 system_call+0x58/0x6c This happens because the directory that ext4_find_entry() looks up has inode->i_size that is less than the block size of the filesystem. This causes 'nblocks' to have a value of zero. ext4_bread_batch() ends up not reading any of the directory file's blocks. This renders the entries in bh_use[] array to continue to have garbage data. buffer_uptodate() on bh_use[0] can then return a zero value upon which brelse() function is invoked. This commit fixes the bug by returning -ENOENT when the directory file has no associated blocks. Reported-by: Abdul Haleem Signed-off-by: Chandan Rajendra Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 47711554c6da1a8904abb2ba43d2f7ee1b1b5a00 Author: Eryu Guan Date: Sun Dec 3 22:52:51 2017 -0500 ext4: fix fdatasync(2) after fallocate(2) operation [ Upstream commit c894aa97577e47d3066b27b32499ecf899bfa8b0 ] Currently, fallocate(2) with KEEP_SIZE followed by a fdatasync(2) then crash, we'll see wrong allocated block number (stat -c %b), the blocks allocated beyond EOF are all lost. fstests generic/468 exposes this bug. Commit 67a7d5f561f4 ("ext4: fix fdatasync(2) after extent manipulation operations") fixed all the other extent manipulation operation paths such as hole punch, zero range, collapse range etc., but forgot the fallocate case. So similarly, fix it by recording the correct journal tid in ext4 inode in fallocate(2) path, so that ext4_sync_file() will wait for the right tid to be committed on fdatasync(2). This addresses the test failure in xfstests test generic/468. Signed-off-by: Eryu Guan Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 6b92b07e2cb0e90327f03928cadfcbb625d33378 Author: Adam Wallis Date: Mon Nov 27 10:45:01 2017 -0500 dmaengine: dmatest: move callback wait queue to thread context [ Upstream commit 6f6a23a213be51728502b88741ba6a10cda2441d ] Commit adfa543e7314 ("dmatest: don't use set_freezable_with_signal()") introduced a bug (that is in fact documented by the patch commit text) that leaves behind a dangling pointer. Since the done_wait structure is allocated on the stack, future invocations to the DMATEST can produce undesirable results (e.g., corrupted spinlocks). Commit a9df21e34b42 ("dmaengine: dmatest: warn user when dma test times out") attempted to WARN the user that the stack was likely corrupted but did not fix the actual issue. This patch fixes the issue by pushing the wait queue and callback structs into the the thread structure. If a failure occurs due to time, dmaengine_terminate_all will force the callback to safely call wake_up_all() without possibility of using a freed pointer. Cc: stable@vger.kernel.org Bug: https://bugzilla.kernel.org/show_bug.cgi?id=197605 Fixes: adfa543e7314 ("dmatest: don't use set_freezable_with_signal()") Reviewed-by: Sinan Kaya Suggested-by: Shunyong Yang Signed-off-by: Adam Wallis Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 412551c467675878bbf55667e478da7cddcd66a4 Author: Mathias Nyman Date: Fri Dec 8 18:10:05 2017 +0200 xhci: Don't add a virt_dev to the devs array before it's fully allocated [ Upstream commit 5d9b70f7d52eb14bb37861c663bae44de9521c35 ] Avoid null pointer dereference if some function is walking through the devs array accessing members of a new virt_dev that is mid allocation. Add the virt_dev to xhci->devs[i] _after_ the virt_device and all its members are properly allocated. issue found by KASAN: null-ptr-deref in xhci_find_slot_id_by_port "Quick analysis suggests that xhci_alloc_virt_device() is not mutex protected. If so, there is a time frame where xhci->devs[slot_id] is set but not fully initialized. Specifically, xhci->devs[i]->udev can be NULL." Cc: stable Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit e1d46b53a6442b23bb4de8ed59c34a75cc6d25d4 Author: Sukumar Ghorai Date: Wed Aug 16 14:46:55 2017 -0700 Bluetooth: btusb: driver to enable the usb-wakeup feature [ Upstream commit a0085f2510e8976614ad8f766b209448b385492f ] BT-Controller connected as platform non-root-hub device and usb-driver initialize such device with wakeup disabled, Ref. usb_new_device(). At present wakeup-capability get enabled by hid-input device from usb function driver(e.g. BT HID device) at runtime. Again some functional driver does not set usb-wakeup capability(e.g LE HID device implement as HID-over-GATT), and can't wakeup the host on USB. Most of the device operation (such as mass storage) initiated from host (except HID) and USB wakeup aligned with host resume procedure. For BT device, usb-wakeup capability need to enable form btusc driver as a generic solution for multiple profile use case and required for USB remote wakeup (in-bus wakeup) while host is suspended. Also usb-wakeup feature need to enable/disable with HCI interface up and down. Signed-off-by: Sukumar Ghorai Signed-off-by: Amit K Bag Acked-by: Oliver Neukum Signed-off-by: Marcel Holtmann Signed-off-by: Sasha Levin commit 5319d08ca465eec277d04b5a3cee34f80b601c74 Author: Shuah Khan Date: Thu Dec 7 14:16:50 2017 -0700 usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer [ Upstream commit be6123df1ea8f01ee2f896a16c2b7be3e4557a5a ] stub_send_ret_submit() handles urb with a potential null transfer_buffer, when it replays a packet with potential malicious data that could contain a null buffer. Add a check for the condition when actual_length > 0 and transfer_buffer is null. Reported-by: Secunia Research Cc: stable Signed-off-by: Shuah Khan Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 5b2323b62af18be000ef627f302b5bf167402de6 Author: Alan Stern Date: Tue Dec 12 14:25:13 2017 -0500 USB: core: prevent malicious bNumInterfaces overflow [ Upstream commit 48a4ff1c7bb5a32d2e396b03132d20d552c0eca7 ] A malicious USB device with crafted descriptors can cause the kernel to access unallocated memory by setting the bNumInterfaces value too high in a configuration descriptor. Although the value is adjusted during parsing, this adjustment is skipped in one of the error return paths. This patch prevents the problem by setting bNumInterfaces to 0 initially. The existing code already sets it to the proper value after parsing is complete. Signed-off-by: Alan Stern Reported-by: Andrey Konovalov CC: Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 6be358198e6ec261dad9b9b0f735dbb741a4fd72 Author: David Kozub Date: Tue Dec 5 22:40:04 2017 +0100 USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID [ Upstream commit 62354454625741f0569c2cbe45b2d192f8fd258e ] There is another JMS567-based USB3 UAS enclosure (152d:0578) that fails with the following error: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [sda] tag#0 Sense Key : Illegal Request [current] [sda] tag#0 Add. Sense: Invalid field in cdb The issue occurs both with UAS (occasionally) and mass storage (immediately after mounting a FS on a disk in the enclosure). Enabling US_FL_BROKEN_FUA quirk solves this issue. This patch adds an UNUSUAL_DEV with US_FL_BROKEN_FUA for the enclosure for both UAS and mass storage. Signed-off-by: David Kozub Acked-by: Alan Stern Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit d3c33f3be0db46d534332f09217a1c5cf745c3de Author: Changbin Du Date: Thu Nov 30 11:39:43 2017 +0800 tracing: Allocate mask_str buffer dynamically [ Upstream commit 90e406f96f630c07d631a021fd4af10aac913e77 ] The default NR_CPUS can be very large, but actual possible nr_cpu_ids usually is very small. For my x86 distribution, the NR_CPUS is 8192 and nr_cpu_ids is 4. About 2 pages are wasted. Most machines don't have so many CPUs, so define a array with NR_CPUS just wastes memory. So let's allocate the buffer dynamically when need. With this change, the mutext tracing_cpumask_update_lock also can be removed now, which was used to protect mask_str. Link: http://lkml.kernel.org/r/1512013183-19107-1-git-send-email-changbin.du@intel.com Fixes: 36dfe9252bd4c ("ftrace: make use of tracing_cpumask") Cc: stable@vger.kernel.org Signed-off-by: Changbin Du Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin commit 705277dab0bf482aee987e2c92139c590ac72922 Author: NeilBrown Date: Thu Dec 14 15:32:38 2017 -0800 autofs: fix careless error in recent commit [ Upstream commit 302ec300ef8a545a7fc7f667e5fd743b091c2eeb ] Commit ecc0c469f277 ("autofs: don't fail mount for transient error") was meant to replace an 'if' with a 'switch', but instead added the 'switch' leaving the case in place. Link: http://lkml.kernel.org/r/87zi6wstmw.fsf@notabene.neil.brown.name Fixes: ecc0c469f277 ("autofs: don't fail mount for transient error") Reported-by: Ben Hutchings Signed-off-by: NeilBrown Cc: Ian Kent Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit bbda4c57b91619642a94b193531312fe01bc2398 Author: Eric Biggers Date: Tue Nov 28 20:56:59 2017 -0800 crypto: salsa20 - fix blkcipher_walk API usage [ Upstream commit ecaaab5649781c5a0effdaf298a925063020500e ] When asked to encrypt or decrypt 0 bytes, both the generic and x86 implementations of Salsa20 crash in blkcipher_walk_done(), either when doing 'kfree(walk->buffer)' or 'free_page((unsigned long)walk->page)', because walk->buffer and walk->page have not been initialized. The bug is that Salsa20 is calling blkcipher_walk_done() even when nothing is in 'walk.nbytes'. But blkcipher_walk_done() is only meant to be called when a nonzero number of bytes have been provided. The broken code is part of an optimization that tries to make only one call to salsa20_encrypt_bytes() to process inputs that are not evenly divisible by 64 bytes. To fix the bug, just remove this "optimization" and use the blkcipher_walk API the same way all the other users do. Reproducer: #include #include #include int main() { int algfd, reqfd; struct sockaddr_alg addr = { .salg_type = "skcipher", .salg_name = "salsa20", }; char key[16] = { 0 }; algfd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(algfd, (void *)&addr, sizeof(addr)); reqfd = accept(algfd, 0, 0); setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key)); read(reqfd, key, sizeof(key)); } Reported-by: syzbot Fixes: eb6f13eb9f81 ("[CRYPTO] salsa20_generic: Fix multi-page processing") Cc: # v2.6.25+ Signed-off-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit bd7f57da8fff9b75204d6dd2b3ac6a30a6430a5c Author: Eric Biggers Date: Tue Nov 28 18:01:38 2017 -0800 crypto: hmac - require that the underlying hash algorithm is unkeyed [ Upstream commit af3ff8045bbf3e32f1a448542e73abb4c8ceb6f1 ] Because the HMAC template didn't check that its underlying hash algorithm is unkeyed, trying to use "hmac(hmac(sha3-512-generic))" through AF_ALG or through KEYCTL_DH_COMPUTE resulted in the inner HMAC being used without having been keyed, resulting in sha3_update() being called without sha3_init(), causing a stack buffer overflow. This is a very old bug, but it seems to have only started causing real problems when SHA-3 support was added (requires CONFIG_CRYPTO_SHA3) because the innermost hash's state is ->import()ed from a zeroed buffer, and it just so happens that other hash algorithms are fine with that, but SHA-3 is not. However, there could be arch or hardware-dependent hash algorithms also affected; I couldn't test everything. Fix the bug by introducing a function crypto_shash_alg_has_setkey() which tests whether a shash algorithm is keyed. Then update the HMAC template to require that its underlying hash algorithm is unkeyed. Here is a reproducer: #include #include int main() { int algfd; struct sockaddr_alg addr = { .salg_type = "hash", .salg_name = "hmac(hmac(sha3-512-generic))", }; char key[4096] = { 0 }; algfd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(algfd, (const struct sockaddr *)&addr, sizeof(addr)); setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key)); } Here was the KASAN report from syzbot: BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:341 [inline] BUG: KASAN: stack-out-of-bounds in sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161 Write of size 4096 at addr ffff8801cca07c40 by task syzkaller076574/3044 CPU: 1 PID: 3044 Comm: syzkaller076574 Not tainted 4.14.0-mm1+ #25 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x257 lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:252 kasan_report_error mm/kasan/report.c:351 [inline] kasan_report+0x25b/0x340 mm/kasan/report.c:409 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x137/0x190 mm/kasan/kasan.c:267 memcpy+0x37/0x50 mm/kasan/kasan.c:303 memcpy include/linux/string.h:341 [inline] sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161 crypto_shash_update+0xcb/0x220 crypto/shash.c:109 shash_finup_unaligned+0x2a/0x60 crypto/shash.c:151 crypto_shash_finup+0xc4/0x120 crypto/shash.c:165 hmac_finup+0x182/0x330 crypto/hmac.c:152 crypto_shash_finup+0xc4/0x120 crypto/shash.c:165 shash_digest_unaligned+0x9e/0xd0 crypto/shash.c:172 crypto_shash_digest+0xc4/0x120 crypto/shash.c:186 hmac_setkey+0x36a/0x690 crypto/hmac.c:66 crypto_shash_setkey+0xad/0x190 crypto/shash.c:64 shash_async_setkey+0x47/0x60 crypto/shash.c:207 crypto_ahash_setkey+0xaf/0x180 crypto/ahash.c:200 hash_setkey+0x40/0x90 crypto/algif_hash.c:446 alg_setkey crypto/af_alg.c:221 [inline] alg_setsockopt+0x2a1/0x350 crypto/af_alg.c:254 SYSC_setsockopt net/socket.c:1851 [inline] SyS_setsockopt+0x189/0x360 net/socket.c:1830 entry_SYSCALL_64_fastpath+0x1f/0x96 Reported-by: syzbot Cc: Signed-off-by: Eric Biggers Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 4e1911325f19114d781c9e8b14a3565d966bd64a Author: Vincent Pelletier Date: Sun Nov 26 06:52:53 2017 +0000 usb: gadget: ffs: Forbid usb_ep_alloc_request from sleeping [ Upstream commit 30bf90ccdec1da9c8198b161ecbff39ce4e5a9ba ] Found using DEBUG_ATOMIC_SLEEP while submitting an AIO read operation: [ 100.853642] BUG: sleeping function called from invalid context at mm/slab.h:421 [ 100.861148] in_atomic(): 1, irqs_disabled(): 1, pid: 1880, name: python [ 100.867954] 2 locks held by python/1880: [ 100.867961] #0: (&epfile->mutex){....}, at: [] ffs_mutex_lock+0x27/0x30 [usb_f_fs] [ 100.868020] #1: (&(&ffs->eps_lock)->rlock){....}, at: [] ffs_epfile_io.isra.17+0x24b/0x590 [usb_f_fs] [ 100.868076] CPU: 1 PID: 1880 Comm: python Not tainted 4.14.0-edison+ #118 [ 100.868085] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48 [ 100.868093] Call Trace: [ 100.868122] dump_stack+0x47/0x62 [ 100.868156] ___might_sleep+0xfd/0x110 [ 100.868182] __might_sleep+0x68/0x70 [ 100.868217] kmem_cache_alloc_trace+0x4b/0x200 [ 100.868248] ? dwc3_gadget_ep_alloc_request+0x24/0xe0 [dwc3] [ 100.868302] dwc3_gadget_ep_alloc_request+0x24/0xe0 [dwc3] [ 100.868343] usb_ep_alloc_request+0x16/0xc0 [udc_core] [ 100.868386] ffs_epfile_io.isra.17+0x444/0x590 [usb_f_fs] [ 100.868424] ? _raw_spin_unlock_irqrestore+0x27/0x40 [ 100.868457] ? kiocb_set_cancel_fn+0x57/0x60 [ 100.868477] ? ffs_ep0_poll+0xc0/0xc0 [usb_f_fs] [ 100.868512] ffs_epfile_read_iter+0xfe/0x157 [usb_f_fs] [ 100.868551] ? security_file_permission+0x9c/0xd0 [ 100.868587] ? rw_verify_area+0xac/0x120 [ 100.868633] aio_read+0x9d/0x100 [ 100.868692] ? __fget+0xa2/0xd0 [ 100.868727] ? __might_sleep+0x68/0x70 [ 100.868763] SyS_io_submit+0x471/0x680 [ 100.868878] do_int80_syscall_32+0x4e/0xd0 [ 100.868921] entry_INT80_32+0x2a/0x2a [ 100.868932] EIP: 0xb7fbb676 [ 100.868941] EFLAGS: 00000292 CPU: 1 [ 100.868951] EAX: ffffffda EBX: b7aa2000 ECX: 00000002 EDX: b7af8368 [ 100.868961] ESI: b7fbb660 EDI: b7aab000 EBP: bfb6c658 ESP: bfb6c638 [ 100.868973] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b Signed-off-by: Vincent Pelletier Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin commit 72b01bee76591d3741a87a689bf210d729d530f8 Author: Eric Dumazet Date: Tue Nov 28 08:03:30 2017 -0800 net/packet: fix a race in packet_bind() and packet_notifier() [ Upstream commit 15fe076edea787807a7cdc168df832544b58eba6 ] syzbot reported crashes [1] and provided a C repro easing bug hunting. When/if packet_do_bind() calls __unregister_prot_hook() and releases po->bind_lock, another thread can run packet_notifier() and process an NETDEV_UP event. This calls register_prot_hook() and hooks again the socket right before first thread is able to grab again po->bind_lock. Fixes this issue by temporarily setting po->num to 0, as suggested by David Miller. [1] dev_remove_pack: ffff8801bf16fa80 not found ------------[ cut here ]------------ kernel BUG at net/core/dev.c:7945! ( BUG_ON(!list_empty(&dev->ptype_all)); ) invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: device syz0 entered promiscuous mode CPU: 0 PID: 3161 Comm: syzkaller404108 Not tainted 4.14.0+ #190 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff8801cc57a500 task.stack: ffff8801cc588000 RIP: 0010:netdev_run_todo+0x772/0xae0 net/core/dev.c:7945 RSP: 0018:ffff8801cc58f598 EFLAGS: 00010293 RAX: ffff8801cc57a500 RBX: dffffc0000000000 RCX: ffffffff841f75b2 RDX: 0000000000000000 RSI: 1ffff100398b1ede RDI: ffff8801bf1f8810 device syz0 entered promiscuous mode RBP: ffff8801cc58f898 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801bf1f8cd8 R13: ffff8801cc58f870 R14: ffff8801bf1f8780 R15: ffff8801cc58f7f0 FS: 0000000001716880(0000) GS:ffff8801db400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020b13000 CR3: 0000000005e25000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rtnl_unlock+0xe/0x10 net/core/rtnetlink.c:106 tun_detach drivers/net/tun.c:670 [inline] tun_chr_close+0x49/0x60 drivers/net/tun.c:2845 __fput+0x333/0x7f0 fs/file_table.c:210 ____fput+0x15/0x20 fs/file_table.c:244 task_work_run+0x199/0x270 kernel/task_work.c:113 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x9bb/0x1ae0 kernel/exit.c:865 do_group_exit+0x149/0x400 kernel/exit.c:968 SYSC_exit_group kernel/exit.c:979 [inline] SyS_exit_group+0x1d/0x20 kernel/exit.c:977 entry_SYSCALL_64_fastpath+0x1f/0x96 RIP: 0033:0x44ad19 Fixes: 30f7ea1c2b5f ("packet: race condition in packet_bind") Signed-off-by: Eric Dumazet Reported-by: syzbot Cc: Francesco Ruggeri Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4ccd91f02562785d51cbbd4546099ab9fdd7271b Author: Hangbin Liu Date: Thu Nov 30 10:41:14 2017 +0800 sit: update frag_off info [ Upstream commit f859b4af1c52493ec21173ccc73d0b60029b5b88 ] After parsing the sit netlink change info, we forget to update frag_off in ipip6_tunnel_update(). Fix it by assigning frag_off with new value. Reported-by: Jianlin Shi Signed-off-by: Hangbin Liu Acked-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit dcd241dca9507cf4b3980505e2482ed6aba347b5 Author: Håkon Bugge Date: Wed Dec 6 17:18:28 2017 +0100 rds: Fix NULL pointer dereference in __rds_rdma_map [ Upstream commit f3069c6d33f6ae63a1668737bc78aaaa51bff7ca ] This is a fix for syzkaller719569, where memory registration was attempted without any underlying transport being loaded. Analysis of the case reveals that it is the setsockopt() RDS_GET_MR (2) and RDS_GET_MR_FOR_DEST (7) that are vulnerable. Here is an example stack trace when the bug is hit: BUG: unable to handle kernel NULL pointer dereference at 00000000000000c0 IP: __rds_rdma_map+0x36/0x440 [rds] PGD 2f93d03067 P4D 2f93d03067 PUD 2f93d02067 PMD 0 Oops: 0000 [#1] SMP Modules linked in: bridge stp llc tun rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rds binfmt_misc sb_edac intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul c rc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd glue_helper cryptd iTCO_wdt mei_me sg iTCO_vendor_support ipmi_si mei ipmi_devintf nfsd shpchp pcspkr i2c_i801 ioatd ma ipmi_msghandler wmi lpc_ich mfd_core auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2 mgag200 i2c_algo_bit drm_kms_helper ixgbe syscopyarea ahci sysfillrect sysimgblt libahci mdio fb_sys_fops ttm ptp libata sd_mod mlx4_core drm crc32c_intel pps_core megaraid_sas i2c_core dca dm_mirror dm_region_hash dm_log dm_mod CPU: 48 PID: 45787 Comm: repro_set2 Not tainted 4.14.2-3.el7uek.x86_64 #2 Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017 task: ffff882f9190db00 task.stack: ffffc9002b994000 RIP: 0010:__rds_rdma_map+0x36/0x440 [rds] RSP: 0018:ffffc9002b997df0 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff882fa2182580 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffc9002b997e40 RDI: ffff882fa2182580 RBP: ffffc9002b997e30 R08: 0000000000000000 R09: 0000000000000002 R10: ffff885fb29e3838 R11: 0000000000000000 R12: ffff882fa2182580 R13: ffff882fa2182580 R14: 0000000000000002 R15: 0000000020000ffc FS: 00007fbffa20b700(0000) GS:ffff882fbfb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000c0 CR3: 0000002f98a66006 CR4: 00000000001606e0 Call Trace: rds_get_mr+0x56/0x80 [rds] rds_setsockopt+0x172/0x340 [rds] ? __fget_light+0x25/0x60 ? __fdget+0x13/0x20 SyS_setsockopt+0x80/0xe0 do_syscall_64+0x67/0x1b0 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7fbff9b117f9 RSP: 002b:00007fbffa20aed8 EFLAGS: 00000293 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00000000000c84a4 RCX: 00007fbff9b117f9 RDX: 0000000000000002 RSI: 0000400000000114 RDI: 000000000000109b RBP: 00007fbffa20af10 R08: 0000000000000020 R09: 00007fbff9dd7860 R10: 0000000020000ffc R11: 0000000000000293 R12: 0000000000000000 R13: 00007fbffa20b9c0 R14: 00007fbffa20b700 R15: 0000000000000021 Code: 41 56 41 55 49 89 fd 41 54 53 48 83 ec 18 8b 87 f0 02 00 00 48 89 55 d0 48 89 4d c8 85 c0 0f 84 2d 03 00 00 48 8b 87 00 03 00 00 <48> 83 b8 c0 00 00 00 00 0f 84 25 03 00 0 0 48 8b 06 48 8b 56 08 The fix is to check the existence of an underlying transport in __rds_rdma_map(). Signed-off-by: Håkon Bugge Reported-by: syzbot Acked-by: Santosh Shilimkar Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0740244eaa5a8fd33f61c23d8ecc04907defa25e Author: Paul Moore Date: Fri Sep 1 09:44:34 2017 -0400 audit: ensure that 'audit=1' actually enables audit for PID 1 [ Upstream commit 173743dd99a49c956b124a74c8aacb0384739a4c ] Prior to this patch we enabled audit in audit_init(), which is too late for PID 1 as the standard initcalls are run after the PID 1 task is forked. This means that we never allocate an audit_context (see audit_alloc()) for PID 1 and therefore miss a lot of audit events generated by PID 1. This patch enables audit as early as possible to help ensure that when PID 1 is forked it can allocate an audit_context if required. Reviewed-by: Richard Guy Briggs Signed-off-by: Paul Moore Signed-off-by: Sasha Levin commit 1a7dc738e4e0bcdb386090e94f168b9db6274a4d Author: David Howells Date: Thu Nov 2 15:27:48 2017 +0000 afs: Connect up the CB.ProbeUuid [ Upstream commit f4b3526d83c40dd8bf5948b9d7a1b2c340f0dcc8 ] The handler for the CB.ProbeUuid operation in the cache manager is implemented, but isn't listed in the switch-statement of operation selection, so won't be used. Fix this by adding it. Signed-off-by: David Howells Signed-off-by: Sasha Levin commit 97370566e9bc8acba3aea9406c2d1bd51484337d Author: Majd Dibbiny Date: Mon Oct 30 14:23:13 2017 +0200 IB/mlx5: Assign send CQ and recv CQ of UMR QP [ Upstream commit 31fde034a8bd964a5c7c1a5663fc87a913158db2 ] The UMR's QP is created by calling mlx5_ib_create_qp directly, and therefore the send CQ and the recv CQ on the ibqp weren't assigned. Assign them right after calling the mlx5_ib_create_qp to assure that any access to those pointers will work as expected and won't crash the system as might happen as part of reset flow. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Majd Dibbiny Reviewed-by: Yishai Hadas Signed-off-by: Leon Romanovsky Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin commit f2c5c3e03a7e355cfcb1a0faf1421a683e584261 Author: Mark Bloch Date: Thu Nov 2 15:22:26 2017 +0200 IB/mlx4: Increase maximal message size under UD QP [ Upstream commit 5f22a1d87c5315a98981ecf93cd8de226cffe6ca ] Maximal message should be used as a limit to the max message payload allowed, without the headers. The ConnectX-3 check is done against this value includes the headers. When the payload is 4K this will cause the NIC to drop packets. Increase maximal message to 8K as workaround, this shouldn't change current behaviour because we continue to set the MTU to 4k. To reproduce; set MTU to 4296 on the corresponding interface, for example: ifconfig eth0 mtu 4296 (both server and client) On server: ib_send_bw -c UD -d mlx4_0 -s 4096 -n 1000000 -i1 -m 4096 On client: ib_send_bw -d mlx4_0 -c UD -s 4096 -n 1000000 -i 1 -m 4096 Fixes: 6e0d733d9215 ("IB/mlx4: Allow 4K messages for UD QPs") Signed-off-by: Mark Bloch Reviewed-by: Majd Dibbiny Signed-off-by: Leon Romanovsky Signed-off-by: Doug Ledford Signed-off-by: Sasha Levin commit b486022b7bb8ccce0e44d43e77f15ed5f9f6cb68 Author: Herbert Xu Date: Fri Nov 10 14:14:06 2017 +1100 xfrm: Copy policy family in clone_policy [ Upstream commit 0e74aa1d79a5bbc663e03a2804399cae418a0321 ] The syzbot found an ancient bug in the IPsec code. When we cloned a socket policy (for example, for a child TCP socket derived from a listening socket), we did not copy the family field. This results in a live policy with a zero family field. This triggers a BUG_ON check in the af_key code when the cloned policy is retrieved. This patch fixes it by copying the family field over. Reported-by: syzbot Signed-off-by: Herbert Xu Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit 0770152f94381de6750af9bc809f8a5f81e4909a Author: Arvind Yadav Date: Tue Nov 14 13:42:38 2017 +0530 atm: horizon: Fix irq release error [ Upstream commit bde533f2ea607cbbbe76ef8738b36243939a7bc2 ] atm_dev_register() can fail here and passed parameters to free irq which is not initialised. Initialization of 'dev->irq' happened after the 'goto out_free_irq'. So using 'irq' insted of 'dev->irq' in free_irq(). Signed-off-by: Arvind Yadav Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 9cdc7a37fae11c28727a7455c6469c44bfd585a6 Author: Xin Long Date: Wed Nov 15 16:57:26 2017 +0800 sctp: use the right sk after waking up from wait_buf sleep [ Upstream commit cea0cc80a6777beb6eb643d4ad53690e1ad1d4ff ] Commit dfcb9f4f99f1 ("sctp: deny peeloff operation on asocs with threads sleeping on it") fixed the race between peeloff and wait sndbuf by checking waitqueue_active(&asoc->wait) in sctp_do_peeloff(). But it actually doesn't work, as even if waitqueue_active returns false the waiting sndbuf thread may still not yet hold sk lock. After asoc is peeled off, sk is not asoc->base.sk any more, then to hold the old sk lock couldn't make assoc safe to access. This patch is to fix this by changing to hold the new sk lock if sk is not asoc->base.sk, meanwhile, also set the sk in sctp_sendmsg with the new sk. With this fix, there is no more race between peeloff and waitbuf, the check 'waitqueue_active' in sctp_do_peeloff can be removed. Thanks Marcelo and Neil for making this clear. v1->v2: fix it by changing to lock the new sock instead of adding a flag in asoc. Suggested-by: Neil Horman Signed-off-by: Xin Long Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1b62d491c31fbd831ee3485157a3903f9445464b Author: Xin Long Date: Wed Nov 15 16:55:54 2017 +0800 sctp: do not free asoc when it is already dead in sctp_sendmsg [ Upstream commit ca3af4dd28cff4e7216e213ba3b671fbf9f84758 ] Now in sctp_sendmsg sctp_wait_for_sndbuf could schedule out without holding sock sk. It means the current asoc can be freed elsewhere, like when receiving an abort packet. If the asoc is just created in sctp_sendmsg and sctp_wait_for_sndbuf returns err, the asoc will be freed again due to new_asoc is not nil. An use-after-free issue would be triggered by this. This patch is to fix it by setting new_asoc with nil if the asoc is already dead when cpu schedules back, so that it will not be freed again in sctp_sendmsg. v1->v2: set new_asoc as nil in sctp_sendmsg instead of sctp_wait_for_sndbuf. Suggested-by: Neil Horman Reported-by: Dmitry Vyukov Signed-off-by: Xin Long Acked-by: Neil Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6b015b715200452c0967eae5f1cdcf05e92611d4 Author: Pavel Tatashin Date: Wed Nov 15 17:36:18 2017 -0800 sparc64/mm: set fields in deferred pages [ Upstream commit 2a20aa171071a334d80c4e5d5af719d8374702fc ] Without deferred struct page feature (CONFIG_DEFERRED_STRUCT_PAGE_INIT), flags and other fields in "struct page"es are never changed prior to first initializing struct pages by going through __init_single_page(). With deferred struct page feature enabled there is a case where we set some fields prior to initializing: mem_init() { register_page_bootmem_info(); free_all_bootmem(); ... } When register_page_bootmem_info() is called only non-deferred struct pages are initialized. But, this function goes through some reserved pages which might be part of the deferred, and thus are not yet initialized. mem_init register_page_bootmem_info register_page_bootmem_info_node get_page_bootmem .. setting fields here .. such as: page->freelist = (void *)type; free_all_bootmem() free_low_memory_core_early() for_each_reserved_mem_region() reserve_bootmem_region() init_reserved_page() <- Only if this is deferred reserved page __init_single_pfn() __init_single_page() memset(0) <-- Loose the set fields here We end up with similar issue as in the previous patch, where currently we do not observe problem as memory is zeroed. But, if flag asserts are changed we can start hitting issues. Also, because in this patch series we will stop zeroing struct page memory during allocation, we must make sure that struct pages are properly initialized prior to using them. The deferred-reserved pages are initialized in free_all_bootmem(). Therefore, the fix is to switch the above calls. Link: http://lkml.kernel.org/r/20171013173214.27300-4-pasha.tatashin@oracle.com Signed-off-by: Pavel Tatashin Reviewed-by: Steven Sistare Reviewed-by: Daniel Jordan Reviewed-by: Bob Picco Acked-by: David S. Miller Acked-by: Michal Hocko Cc: Alexander Potapenko Cc: Andrey Ryabinin Cc: Ard Biesheuvel Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Dmitry Vyukov Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Mark Rutland Cc: Matthew Wilcox Cc: Mel Gorman Cc: Michal Hocko Cc: Sam Ravnborg Cc: Thomas Gleixner Cc: Will Deacon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit c9fb7628f9c8a12c698558fbf7b62ee17650f780 Author: Chuck Lever Date: Fri Nov 3 13:46:06 2017 -0400 sunrpc: Fix rpc_task_begin trace point [ Upstream commit b2bfe5915d5fe7577221031a39ac722a0a2a1199 ] The rpc_task_begin trace point always display a task ID of zero. Move the trace point call site so that it picks up the new task ID. Signed-off-by: Chuck Lever Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit 3c2c40b46e8586b37f4b76e1ba570d40108b2ee0 Author: Trond Myklebust Date: Mon Nov 6 15:28:04 2017 -0500 NFS: Fix a typo in nfs_rename() [ Upstream commit d803224c84be067754db7fa58a93f36f61566493 ] On successful rename, the "old_dentry" is retained and is attached to the "new_dir", so we need to call nfs_set_verifier() accordingly. Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit bf48b8e1f2f6ac7937a058c3cc5b8b666dcdf7cf Author: Randy Dunlap Date: Fri Nov 17 15:27:35 2017 -0800 dynamic-debug-howto: fix optional/omitted ending line number to be LARGE instead of 0 [ Upstream commit 1f3c790bd5989fcfec9e53ad8fa09f5b740c958f ] line-range is supposed to treat "1-" as "1-endoffile", so handle the special case by setting last_lineno to UINT_MAX. Fixes this error: dynamic_debug:ddebug_parse_query: last-line:0 < 1st-line:1 dynamic_debug:ddebug_exec_query: query parse failed Link: http://lkml.kernel.org/r/10a6a101-e2be-209f-1f41-54637824788e@infradead.org Signed-off-by: Randy Dunlap Acked-by: Jason Baron Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 3888f4ba09bdb5177b011ff7fbd431e8545395f8 Author: Stephen Bates Date: Fri Nov 17 15:28:16 2017 -0800 lib/genalloc.c: make the avail variable an atomic_long_t [ Upstream commit 36a3d1dd4e16bcd0d2ddfb4a2ec7092f0ae0d931 ] If the amount of resources allocated to a gen_pool exceeds 2^32 then the avail atomic overflows and this causes problems when clients try and borrow resources from the pool. This is only expected to be an issue on 64 bit systems. Add the header to pull in atomic_long* operations. So that 32 bit systems continue to use atomic32_t but 64 bit systems can use atomic64_t. Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com Signed-off-by: Stephen Bates Reviewed-by: Logan Gunthorpe Reviewed-by: Mathieu Desnoyers Reviewed-by: Daniel Mentz Cc: Jonathan Corbet Cc: Andrew Morton Cc: Will Deacon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 3e3780098e9ffac153347f6bf6dabf0f3d20b135 Author: Xin Long Date: Fri Nov 17 14:27:06 2017 +0800 route: update fnhe_expires for redirect when the fnhe exists [ Upstream commit e39d5246111399dbc6e11cd39fd8580191b86c47 ] Now when creating fnhe for redirect, it sets fnhe_expires for this new route cache. But when updating the exist one, it doesn't do it. It will cause this fnhe never to be expired. Paolo already noticed it before, in Jianlin's test case, it became even worse: When ip route flush cache, the old fnhe is not to be removed, but only clean it's members. When redirect comes again, this fnhe will be found and updated, but never be expired due to fnhe_expires not being set. So fix it by simply updating fnhe_expires even it's for redirect. Fixes: aee06da6726d ("ipv4: use seqlock for nh_exceptions") Reported-by: Jianlin Shi Acked-by: Hannes Frederic Sowa Signed-off-by: Xin Long Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 2f52207aba567a9e02e739e72c904ffd5edb042f Author: Xin Long Date: Fri Nov 17 14:27:18 2017 +0800 route: also update fnhe_genid when updating a route cache [ Upstream commit cebe84c6190d741045a322f5343f717139993c08 ] Now when ip route flush cache and it turn out all fnhe_genid != genid. If a redirect/pmtu icmp packet comes and the old fnhe is found and all it's members but fnhe_genid will be updated. Then next time when it looks up route and tries to rebind this fnhe to the new dst, the fnhe will be flushed due to fnhe_genid != genid. It causes this redirect/pmtu icmp packet acutally not to be applied. This patch is to also reset fnhe_genid when updating a route cache. Fixes: 5aad1de5ea2c ("ipv4: use separate genid for next hop exceptions") Acked-by: Hannes Frederic Sowa Signed-off-by: Xin Long Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit bdc95e2f8a07c9f6c27a123de732bb15293ff7d0 Author: Ben Hutchings Date: Fri Nov 10 18:48:50 2017 +0000 mac80211_hwsim: Fix memory leak in hwsim_new_radio_nl() [ Upstream commit 67bd52386125ce1159c0581cbcd2740addf33cd4 ] hwsim_new_radio_nl() now copies the name attribute in order to add a null-terminator. mac80211_hwsim_new_radio() (indirectly) copies it again into the net_device structure, so the first copy is not used or freed later. Free the first copy before returning. Fixes: ff4dd73dd2b4 ("mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length") Signed-off-by: Ben Hutchings Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit e5a85e30e2926626290921aa7c624ed6befdcefc Author: Jérémy Lefaure Date: Wed Jun 28 20:57:29 2017 -0400 EDAC, i5000, i5400: Fix definition of NRECMEMB register [ Upstream commit a8c8261425649da58bdf08221570e5335ad33a31 ] In the i5000 and i5400 drivers, the NRECMEMB register is defined as a 16-bit value, which results in wrong shifts in the code, as reported by sparse. In the datasheets ([1], section 3.9.22.20 and [2], section 3.9.22.21), this register is a 32-bit register. A u32 value for the register fixes the wrong shifts warnings and matches the datasheet. Also fix the mask to access to the CAS bits [27:16] in the i5000 driver. [1]: https://www.intel.com/content/dam/doc/datasheet/5000p-5000v-5000z-chipset-memory-controller-hub-datasheet.pdf [2]: https://www.intel.se/content/dam/doc/datasheet/5400-chipset-memory-controller-hub-datasheet.pdf Signed-off-by: Jérémy Lefaure Cc: linux-edac Link: http://lkml.kernel.org/r/20170629005729.8478-1-jeremy.lefaure@lse.epita.fr Signed-off-by: Borislav Petkov Signed-off-by: Sasha Levin commit a838aec6b78acf4f5a9d61463bfe288d5dfe28c6 Author: Jérémy Lefaure Date: Wed Mar 8 20:18:09 2017 -0500 EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro [ Upstream commit e61555c29c28a4a3b6ba6207f4a0883ee236004d ] The MTR_DRAM_WIDTH macro returns the data width. It is sometimes used as if it returned a boolean true if the width if 8. Fix the tests where MTR_DRAM_WIDTH is misused. Signed-off-by: Jérémy Lefaure Cc: linux-edac Link: http://lkml.kernel.org/r/20170309011809.8340-1-jeremy.lefaure@lse.epita.fr Signed-off-by: Borislav Petkov Signed-off-by: Sasha Levin commit ce0872ea1a1302c7c4d2f73d19a83b5b1c8d5adb Author: Jan Kara Date: Wed Mar 8 14:56:05 2017 +0100 axonram: Fix gendisk handling [ Upstream commit 672a2c87c83649fb0167202342ce85af9a3b4f1c ] It is invalid to call del_gendisk() when disk->queue is NULL. Fix error handling in axon_ram_probe() to avoid doing that. Also del_gendisk() does not drop a reference to gendisk allocated by alloc_disk(). That has to be done by put_disk(). Add that call where needed. Reported-by: Dan Carpenter Signed-off-by: Jan Kara Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 682a37f2ac67b0dbb31c22c697cc0d13338a6e9e Author: Chris Brandt Date: Mon Mar 6 15:20:51 2017 -0500 i2c: riic: fix restart condition [ Upstream commit 2501c1bb054290679baad0ff7f4f07c714251f4c ] While modifying the driver to use the STOP interrupt, the completion of the intermediate transfers need to wake the driver back up in order to initiate the next transfer (restart condition). Otherwise you get never ending interrupts and only the first transfer sent. Fixes: 71ccea095ea1 ("i2c: riic: correctly finish transfers") Reported-by: Simon Horman Signed-off-by: Chris Brandt Tested-by: Simon Horman Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin commit 2a89161286f919d61c52110850e9c9e199010321 Author: Krzysztof Kozlowski Date: Sun Mar 5 19:14:07 2017 +0200 crypto: s5p-sss - Fix completing crypto request in IRQ handler [ Upstream commit 07de4bc88ce6a4d898cad9aa4c99c1df7e87702d ] In a regular interrupt handler driver was finishing the crypt/decrypt request by calling complete on crypto request. This is disallowed since converting to skcipher in commit b286d8b1a690 ("crypto: skcipher - Add skcipher walk interface") and causes a warning: WARNING: CPU: 0 PID: 0 at crypto/skcipher.c:430 skcipher_walk_first+0x13c/0x14c The interrupt is marked shared but in fact there are no other users sharing it. Thus the simplest solution seems to be to just use a threaded interrupt handler, after converting it to oneshot. Signed-off-by: Krzysztof Kozlowski Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 2c9349827832c48ad6aa02add4e0cb41b6522b79 Author: WANG Cong Date: Sun Mar 5 12:34:53 2017 -0800 ipv6: reorder icmpv6_init() and ip6_mr_init() [ Upstream commit 15e668070a64bb97f102ad9cf3bccbca0545cda8 ] Andrey reported the following kernel crash: kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 14446 Comm: syz-executor6 Not tainted 4.10.0+ #82 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88001f311700 task.stack: ffff88001f6e8000 RIP: 0010:ip6mr_sk_done+0x15a/0x3d0 net/ipv6/ip6mr.c:1618 RSP: 0018:ffff88001f6ef418 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 1ffff10003edde8c RCX: ffffc900043ee000 RDX: 0000000000000004 RSI: ffffffff83e3b3f8 RDI: 0000000000000020 RBP: ffff88001f6ef508 R08: fffffbfff0dcc5d8 R09: 0000000000000000 R10: ffffffff86e62ec0 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: ffff88001f6ef4e0 R15: ffff8800380a0040 FS: 00007f7a52cec700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000061c500 CR3: 000000001f1ae000 CR4: 00000000000006f0 DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Call Trace: rawv6_close+0x4c/0x80 net/ipv6/raw.c:1217 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:432 sock_release+0x8d/0x1e0 net/socket.c:597 __sock_create+0x39d/0x880 net/socket.c:1226 sock_create_kern+0x3f/0x50 net/socket.c:1243 inet_ctl_sock_create+0xbb/0x280 net/ipv4/af_inet.c:1526 icmpv6_sk_init+0x163/0x500 net/ipv6/icmp.c:954 ops_init+0x10a/0x550 net/core/net_namespace.c:115 setup_net+0x261/0x660 net/core/net_namespace.c:291 copy_net_ns+0x27e/0x540 net/core/net_namespace.c:396 9pnet_virtio: no channels available for device ./file1 create_new_namespaces+0x437/0x9b0 kernel/nsproxy.c:106 unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205 SYSC_unshare kernel/fork.c:2281 [inline] SyS_unshare+0x64e/0x1000 kernel/fork.c:2231 entry_SYSCALL_64_fastpath+0x1f/0xc2 This is because net->ipv6.mr6_tables is not initialized at that point, ip6mr_rules_init() is not called yet, therefore on the error path when we iterator the list, we trigger this oops. Fix this by reordering ip6mr_rules_init() before icmpv6_sk_init(). Reported-by: Andrey Konovalov Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit bec0486f2e3bb92e9372ec3bda987de095255541 Author: Michal Schmidt Date: Fri Mar 3 17:08:30 2017 +0100 bnx2x: fix possible overrun of VFPF multicast addresses array [ Upstream commit 22118d861cec5da6ed525aaf12a3de9bfeffc58f ] It is too late to check for the limit of the number of VF multicast addresses after they have already been copied to the req->multicast[] array, possibly overflowing it. Do the check before copying. Also fix the error path to not skip unlocking vf2pf_mutex. Signed-off-by: Michal Schmidt Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a713e366738922fadfa782fa8eb31b1d4f6cb6f4 Author: Michal Schmidt Date: Fri Mar 3 17:08:28 2017 +0100 bnx2x: prevent crash when accessing PTP with interface down [ Upstream commit 466e8bf10ac104d96e1ea813e8126e11cb72ea20 ] It is possible to crash the kernel by accessing a PTP device while its associated bnx2x interface is down. Before the interface is brought up, the timecounter is not initialized, so accessing it results in NULL dereference. Fix it by checking if the interface is up. Use -ENETDOWN as the error code when the interface is down. -EFAULT in bnx2x_ptp_adjfreq() did not seem right. Tested using phc_ctl get/set/adj/freq commands. Signed-off-by: Michal Schmidt Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 8fdb631e3e3406433af007116d84f19ad1e6f096 Author: Blomme, Maarten Date: Thu Mar 2 13:08:36 2017 +0100 spi_ks8995: fix "BUG: key accdaa28 not in .data!" [ Upstream commit 4342696df764ec65dcdfbd0c10d90ea52505f8ba ] Signed-off-by: Maarten Blomme Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f1c73d4ed0778d765e808144d987b12f31bd29b2 Author: Mark Rutland Date: Mon Feb 20 12:30:11 2017 +0000 arm: KVM: Survive unknown traps from guests [ Upstream commit f050fe7a9164945dd1c28be05bf00e8cfb082ccf ] Currently we BUG() if we see a HSR.EC value we don't recognise. As configurable disables/enables are added to the architecture (controlled by RES1/RES0 bits respectively), with associated synchronous exceptions, it may be possible for a guest to trigger exceptions with classes that we don't recognise. While we can't service these exceptions in a manner useful to the guest, we can avoid bringing down the host. Per ARM DDI 0406C.c, all currently unallocated HSR EC encodings are reserved, and per ARM DDI 0487A.k_iss10775, page G6-4395, EC values within the range 0x00 - 0x2c are reserved for future use with synchronous exceptions, and EC values within the range 0x2d - 0x3f may be used for either synchronous or asynchronous exceptions. The patch makes KVM handle any unknown EC by injecting an UNDEFINED exception into the guest, with a corresponding (ratelimited) warning in the host dmesg. We could later improve on this with with a new (opt-in) exit to the host userspace. Cc: Dave Martin Cc: Suzuki K Poulose Reviewed-by: Christoffer Dall Signed-off-by: Mark Rutland Signed-off-by: Marc Zyngier Signed-off-by: Sasha Levin commit 38ae8173704dbdae198173b7f3107aec375ab4c1 Author: Wanpeng Li Date: Mon Mar 6 04:03:28 2017 -0800 KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset [ Upstream commit 2f707d97982286b307ef2a9b034e19aabc1abb56 ] Reported by syzkaller: WARNING: CPU: 1 PID: 27742 at arch/x86/kvm/vmx.c:11029 nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029 CPU: 1 PID: 27742 Comm: a.out Not tainted 4.10.0+ #229 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51 panic+0x1fb/0x412 kernel/panic.c:179 __warn+0x1c4/0x1e0 kernel/panic.c:540 warn_slowpath_null+0x2c/0x40 kernel/panic.c:583 nested_vmx_vmexit+0x5c35/0x74d0 arch/x86/kvm/vmx.c:11029 vmx_leave_nested arch/x86/kvm/vmx.c:11136 [inline] vmx_set_msr+0x1565/0x1910 arch/x86/kvm/vmx.c:3324 kvm_set_msr+0xd4/0x170 arch/x86/kvm/x86.c:1099 do_set_msr+0x11e/0x190 arch/x86/kvm/x86.c:1128 __msr_io arch/x86/kvm/x86.c:2577 [inline] msr_io+0x24b/0x450 arch/x86/kvm/x86.c:2614 kvm_arch_vcpu_ioctl+0x35b/0x46a0 arch/x86/kvm/x86.c:3497 kvm_vcpu_ioctl+0x232/0x1120 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2721 vfs_ioctl fs/ioctl.c:43 [inline] do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683 SYSC_ioctl fs/ioctl.c:698 [inline] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689 entry_SYSCALL_64_fastpath+0x1f/0xc2 The syzkaller folks reported a nested_run_pending warning during userspace clear VMX capability which is exposed to L1 before. The warning gets thrown while doing (*(uint32_t*)0x20aecfe8 = (uint32_t)0x1); (*(uint32_t*)0x20aecfec = (uint32_t)0x0); (*(uint32_t*)0x20aecff0 = (uint32_t)0x3a); (*(uint32_t*)0x20aecff4 = (uint32_t)0x0); (*(uint64_t*)0x20aecff8 = (uint64_t)0x0); r[29] = syscall(__NR_ioctl, r[4], 0x4008ae89ul, 0x20aecfe8ul, 0, 0, 0, 0, 0, 0); i.e. KVM_SET_MSR ioctl with struct kvm_msrs { .nmsrs = 1, .pad = 0, .entries = { {.index = MSR_IA32_FEATURE_CONTROL, .reserved = 0, .data = 0} } } The VMLANCH/VMRESUME emulation should be stopped since the CPU is going to reset here. This patch resets the nested_run_pending since the CPU is going to be reset hence there should be nothing pending. Reported-by: Dmitry Vyukov Suggested-by: Radim Krčmář Cc: Paolo Bonzini Cc: Radim Krčmář Cc: Dmitry Vyukov Cc: David Hildenbrand Signed-off-by: Wanpeng Li Reviewed-by: David Hildenbrand Reviewed-by: Jim Mattson Signed-off-by: Radim Krčmář Signed-off-by: Sasha Levin commit a28fd13338eac8b47942e6b2bebfd1cee9f4aae8 Author: Franck Demathieu Date: Mon Mar 6 14:41:06 2017 +0100 irqchip/crossbar: Fix incorrect type of register size [ Upstream commit 4b9de5da7e120c7f02395da729f0ec77ce7a6044 ] The 'size' variable is unsigned according to the dt-bindings. As this variable is used as integer in other places, create a new variable that allows to fix the following sparse issue (-Wtypesign): drivers/irqchip/irq-crossbar.c:279:52: warning: incorrect type in argument 3 (different signedness) drivers/irqchip/irq-crossbar.c:279:52: expected unsigned int [usertype] *out_value drivers/irqchip/irq-crossbar.c:279:52: got int * Signed-off-by: Franck Demathieu Signed-off-by: Marc Zyngier Signed-off-by: Sasha Levin commit a33fe2bcd33d42088aba4778a249ef83e6a3601a Author: James Smart Date: Sat Mar 4 09:30:25 2017 -0800 scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters [ Upstream commit 5d181531bc6169e19a02a27d202cf0e982db9d0e ] if REG_VPI fails, the driver was incorrectly issuing INIT_VFI (a SLI4 command) on a SLI3 adapter. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 6b4d70f553acd2203608eb27858dba88ad68b691 Author: Tejun Heo Date: Mon Mar 6 15:33:42 2017 -0500 workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq [ Upstream commit 637fdbae60d6cb9f6e963c1079d7e0445c86ff7d ] If queue_delayed_work() gets called with NULL @wq, the kernel will oops asynchronuosly on timer expiration which isn't too helpful in tracking down the offender. This actually happened with smc. __queue_delayed_work() already does several input sanity checks synchronously. Add NULL @wq check. Reported-by: Dave Jones Link: http://lkml.kernel.org/r/20170227171439.jshx3qplflyrgcv7@codemonkey.org.uk Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin commit af0e7947de8c4b80e55ebd8bf96b6d795eca9cc3 Author: Tejun Heo Date: Mon Mar 6 15:26:54 2017 -0500 libata: drop WARN from protocol error in ata_sff_qc_issue() [ Upstream commit 0580b762a4d6b70817476b90042813f8573283fa ] ata_sff_qc_issue() expects upper layers to never issue commands on a command protocol that it doesn't implement. While the assumption holds fine with the usual IO path, nothing filters based on the command protocol in the passthrough path (which was added later), allowing the warning to be tripped with a passthrough command with the right (well, wrong) protocol. Failing with AC_ERR_SYSTEM is the right thing to do anyway. Remove the unnecessary WARN. Reported-by: Dmitry Vyukov Link: http://lkml.kernel.org/r/CACT4Y+bXkvevNZU8uP6X0QVqsj6wNoUA_1exfTSOzc+SmUtMOA@mail.gmail.com Signed-off-by: Tejun Heo Signed-off-by: Sasha Levin commit 1c33170076ba1e16f6cb4021e3c76f485dea39cc Author: Christophe JAILLET Date: Tue Feb 21 22:33:11 2017 +0100 USB: gadgetfs: Fix a potential memory leak in 'dev_config()' [ Upstream commit b6e7aeeaf235901c42ec35de4633c7c69501d303 ] 'kbuf' is allocated just a few lines above using 'memdup_user()'. If the 'if (dev->buf)' test fails, this memory is never released. Signed-off-by: Christophe JAILLET Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin commit 37177405c4afb7967cd091946be37ed2c94ff339 Author: John Keeping Date: Tue Feb 28 10:55:30 2017 +0000 usb: gadget: configs: plug memory leak [ Upstream commit 38355b2a44776c25b0f2ad466e8c51bb805b3032 ] When binding a gadget to a device, "name" is stored in gi->udc_name, but this does not happen when unregistering and the string is leaked. Signed-off-by: John Keeping Signed-off-by: Felipe Balbi Signed-off-by: Sasha Levin commit 1792ffc3256cc910c9a016a763bfe240957c5b9b Author: David Daney Date: Wed Mar 1 14:04:53 2017 -0800 module: set __jump_table alignment to 8 [ Upstream commit ab42632156becd35d3884ee5c14da2bedbf3149a ] For powerpc the __jump_table section in modules is not aligned, this causes a WARN_ON() splat when loading a module containing a __jump_table. Strict alignment became necessary with commit 3821fd35b58d ("jump_label: Reduce the size of struct static_key"), currently in linux-next, which uses the two least significant bits of pointers to __jump_table elements. Fix by forcing __jump_table to 8, which is the same alignment used for this section in the kernel proper. Link: http://lkml.kernel.org/r/20170301220453.4756-1-david.daney@cavium.com Reviewed-by: Jason Baron Acked-by: Jessica Yu Acked-by: Michael Ellerman (powerpc) Tested-by: Sachin Sant Signed-off-by: David Daney Signed-off-by: Steven Rostedt (VMware) Signed-off-by: Sasha Levin commit 0e1ac5b7b4619345e87971f798c141d6e9ae74bc Author: Sachin Sant Date: Sun Feb 26 11:38:39 2017 +0530 selftest/powerpc: Fix false failures for skipped tests [ Upstream commit a6d8a21596df041f36f4c2ccc260c459e3e851f1 ] Tests under alignment subdirectory are skipped when executed on previous generation hardware, but harness still marks them as failed. test: test_copy_unaligned tags: git_version:unknown [SKIP] Test skipped on line 26 skip: test_copy_unaligned selftests: copy_unaligned [FAIL] The MAGIC_SKIP_RETURN_VALUE value assigned to rc variable is retained till the program exit which causes the test to be marked as failed. This patch resets the value before returning to the main() routine. With this patch the test o/p is as follows: test: test_copy_unaligned tags: git_version:unknown [SKIP] Test skipped on line 26 skip: test_copy_unaligned selftests: copy_unaligned [PASS] Signed-off-by: Sachin Sant Signed-off-by: Michael Ellerman Signed-off-by: Sasha Levin commit 6500fffd7f0978a35b9d1188b4064483c46d4525 Author: Ladislav Michl Date: Sat Feb 11 14:02:49 2017 +0100 ARM: OMAP2+: gpmc-onenand: propagate error on initialization failure [ Upstream commit 7807e086a2d1f69cc1a57958cac04fea79fc2112 ] gpmc_probe_onenand_child returns success even on gpmc_onenand_init failure. Fix that. Signed-off-by: Ladislav Michl Acked-by: Roger Quadros Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin commit ef14323b7bb53a532bec39d724a1fdd52f0da5a4 Author: Steffen Klassert Date: Wed Feb 15 11:38:58 2017 +0100 vti6: Don't report path MTU below IPV6_MIN_MTU. [ Upstream commit e3dc847a5f85b43ee2bfc8eae407a7e383483228 ] In vti6_xmit(), the check for IPV6_MIN_MTU before we send a ICMPV6_PKT_TOOBIG message is missing. So we might report a PMTU below 1280. Fix this by adding the required check. Fixes: ccd740cbc6e ("vti6: Add pmtu handling to vti6_xmit.") Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit 64fc772bc598e0bbd5222b5a41d3d8bdabf7bc0b Author: Stephen Hemminger Date: Tue Mar 7 09:15:53 2017 -0800 scsi: storvsc: Workaround for virtual DVD SCSI version [ Upstream commit f1c635b439a5c01776fe3a25b1e2dc546ea82e6f ] Hyper-V host emulation of SCSI for virtual DVD device reports SCSI version 0 (UNKNOWN) but is still capable of supporting REPORTLUN. Without this patch, a GEN2 Linux guest on Hyper-V will not boot 4.11 successfully with virtual DVD ROM device. What happens is that the SCSI scan process falls back to doing sequential probing by INQUIRY. But the storvsc driver has a previous workaround that masks/blocks all errors reports from INQUIRY (or MODE_SENSE) commands. This workaround causes the scan to then populate a full set of bogus LUN's on the target and then sends kernel spinning off into a death spiral doing block reads on the non-existent LUNs. By setting the correct blacklist flags, the target with the DVD device is scanned with REPORTLUN and that works correctly. Patch needs to go in current 4.11, it is safe but not necessary in older kernels. Signed-off-by: Stephen Hemminger Reviewed-by: K. Y. Srinivasan Reviewed-by: Christoph Hellwig Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 163b3139e68906a91fe4d924c2705383c712774f Author: Dave Martin Date: Tue Dec 5 14:56:42 2017 +0000 arm64: fpsimd: Prevent registers leaking from dead tasks [ Upstream commit 071b6d4a5d343046f253a5a8835d477d93992002 ] Currently, loading of a task's fpsimd state into the CPU registers is skipped if that task's state is already present in the registers of that CPU. However, the code relies on the struct fpsimd_state * (and by extension struct task_struct *) to unambiguously identify a task. There is a particular case in which this doesn't work reliably: when a task exits, its task_struct may be recycled to describe a new task. Consider the following scenario: 1) Task P loads its fpsimd state onto cpu C. per_cpu(fpsimd_last_state, C) := P; P->thread.fpsimd_state.cpu := C; 2) Task X is scheduled onto C and loads its fpsimd state on C. per_cpu(fpsimd_last_state, C) := X; X->thread.fpsimd_state.cpu := C; 3) X exits, causing X's task_struct to be freed. 4) P forks a new child T, which obtains X's recycled task_struct. T == X. T->thread.fpsimd_state.cpu == C (inherited from P). 5) T is scheduled on C. T's fpsimd state is not loaded, because per_cpu(fpsimd_last_state, C) == T (== X) && T->thread.fpsimd_state.cpu == C. (This is the check performed by fpsimd_thread_switch().) So, T gets X's registers because the last registers loaded onto C were those of X, in (2). This patch fixes the problem by ensuring that the sched-in check fails in (5): fpsimd_flush_task_state(T) is called when T is forked, so that T->thread.fpsimd_state.cpu == C cannot be true. This relies on the fact that T is not schedulable until after copy_thread() completes. Once T's fpsimd state has been loaded on some CPU C there may still be other cpus D for which per_cpu(fpsimd_last_state, D) == &X->thread.fpsimd_state. But D is necessarily != C in this case, and the check in (5) must fail. An alternative fix would be to do refcounting on task_struct. This would result in each CPU holding a reference to the last task whose fpsimd state was loaded there. It's not clear whether this is preferable, and it involves higher overhead than the fix proposed in this patch. It would also move all the task_struct freeing work into the context switch critical section, or otherwise some deferred cleanup mechanism would need to be introduced, neither of which seems obviously justified. Cc: Fixes: 005f78cd8849 ("arm64: defer reloading a task's FPSIMD state to userland resume") Signed-off-by: Dave Martin Reviewed-by: Ard Biesheuvel [will: word-smithed the comment so it makes more sense] Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit 6552b7695ca65e6ca412948d4aa0179df69dbc1d Author: Andrew Honig Date: Fri Dec 1 10:21:09 2017 -0800 KVM: VMX: remove I/O port 0x80 bypass on Intel hosts [ Upstream commit d59d51f088014f25c2562de59b9abff4f42a7468 ] This fixes CVE-2017-1000407. KVM allows guests to directly access I/O port 0x80 on Intel hosts. If the guest floods this port with writes it generates exceptions and instability in the host kernel, leading to a crash. With this change guest writes to port 0x80 on Intel will behave the same as they currently behave on AMD systems. Prevent the flooding by removing the code that sets port 0x80 as a passthrough port. This is essentially the same as upstream patch 99f85a28a78e96d28907fe036e1671a218fee597, except that patch was for AMD chipsets and this patch is for Intel. Signed-off-by: Andrew Honig Signed-off-by: Jim Mattson Fixes: fdef3ad1b386 ("KVM: VMX: Enable io bitmaps to avoid IO port 0x80 VMEXITs") Cc: Signed-off-by: Radim Krčmář Signed-off-by: Sasha Levin commit ca59a0466d51aec281701ba2123d5418afa137f4 Author: Kristina Martsenko Date: Thu Nov 16 17:58:20 2017 +0000 arm64: KVM: fix VTTBR_BADDR_MASK BUG_ON off-by-one [ Upstream commit 26aa7b3b1c0fb3f1a6176a0c1847204ef4355693 ] VTTBR_BADDR_MASK is used to sanity check the size and alignment of the VTTBR address. It seems to currently be off by one, thereby only allowing up to 47-bit addresses (instead of 48-bit) and also insufficiently checking the alignment. This patch fixes it. As an example, with 4k pages, before this patch we have: PHYS_MASK_SHIFT = 48 VTTBR_X = 37 - 24 = 13 VTTBR_BADDR_SHIFT = 13 - 1 = 12 VTTBR_BADDR_MASK = ((1 << 35) - 1) << 12 = 0x00007ffffffff000 Which is wrong, because the mask doesn't allow bit 47 of the VTTBR address to be set, and only requires the address to be 12-bit (4k) aligned, while it actually needs to be 13-bit (8k) aligned because we concatenate two 4k tables. With this patch, the mask becomes 0x0000ffffffffe000, which is what we want. Fixes: 0369f6a34b9f ("arm64: KVM: EL2 register definitions") Cc: # 3.11.x Reviewed-by: Suzuki K Poulose Reviewed-by: Christoffer Dall Signed-off-by: Kristina Martsenko Signed-off-by: Marc Zyngier Signed-off-by: Christoffer Dall Signed-off-by: Sasha Levin commit 31bf7fd58b40cd1c0dc787c17885d8acf773b689 Author: Laurent Caumont Date: Sat Nov 11 12:44:46 2017 -0500 media: dvb: i2c transfers over usb cannot be done from stack [ Upstream commit b4756707152700c96acdfe149cb1ca4cec306c7a ] Since commit 29d2fef8be11 ("usb: catch attempts to submit urbs with a vmalloc'd transfer buffer"), the AverMedia AverTV DVB-T USB 2.0 (a800) fails to probe. Cc: stable@vger.kernel.org Signed-off-by: Sean Young Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 336200ecc7442e99d176e64bed270a1ee12ef30a Author: Dave Gordon Date: Thu Aug 18 18:17:22 2016 +0100 drm: extra printk() wrapper macros [ Upstream commit 30b0da8d556e65ff935a56cd82c05ba0516d3e4a ] We had only DRM_INFO() and DRM_ERROR(), whereas the underlying printk() provides several other useful intermediate levels such as NOTICE and WARNING. So this patch fills out the set by providing both regular and once-only macros for each of the levels INFO, NOTICE, and WARNING, using a common underlying macro that does all the token-pasting. DRM_ERROR is unchanged, as it's not just a printk wrapper. v2: Fix whitespace, missing ## (Eric Engestrom) Signed-off-by: Dave Gordon Reviewed-by: Eric Engestrom Cc: dri-devel@lists.freedesktop.org Acked-by: Dave Airlie Signed-off-by: Tvrtko Ursulin Signed-off-by: Sasha Levin commit 7ccf6a41095507186c4fff0cad8de381a530ac3b Author: Daniel Thompson Date: Mon Mar 2 14:13:36 2015 +0000 kdb: Fix handling of kallsyms_symbol_next() return value [ Upstream commit c07d35338081d107e57cf37572d8cc931a8e32e2 ] kallsyms_symbol_next() returns a boolean (true on success). Currently kdb_read() tests the return value with an inequality that unconditionally evaluates to true. This is fixed in the obvious way and, since the conditional branch is supposed to be unreachable, we also add a WARN_ON(). Reported-by: Dan Carpenter Signed-off-by: Daniel Thompson Cc: linux-stable Signed-off-by: Jason Wessel Signed-off-by: Sasha Levin commit 50d50270d3d6b87c6e44822bee7aac736b1fe0f8 Author: Robin Murphy Date: Thu Sep 28 15:14:01 2017 +0100 iommu/vt-d: Fix scatterlist offset handling [ Upstream commit 29a90b70893817e2f2bb3cea40a29f5308e21b21 ] The intel-iommu DMA ops fail to correctly handle scatterlists where sg->offset is greater than PAGE_SIZE - the IOVA allocation is computed appropriately based on the page-aligned portion of the offset, but the mapping is set up relative to sg->page, which means it fails to actually cover the whole buffer (and in the worst case doesn't cover it at all): (sg->dma_address + sg->dma_len) ----+ sg->dma_address ---------+ | iov_pfn------+ | | | | | v v v iova: a b c d e f |--------|--------|--------|--------|--------| <...calculated....> [_____mapped______] pfn: 0 1 2 3 4 5 |--------|--------|--------|--------|--------| ^ ^ ^ | | | sg->page ----+ | | sg->offset --------------+ | (sg->offset + sg->length) ----------+ As a result, the caller ends up overrunning the mapping into whatever lies beyond, which usually goes badly: [ 429.645492] DMAR: DRHD: handling fault status reg 2 [ 429.650847] DMAR: [DMA Write] Request device [02:00.4] fault addr f2682000 ... Whilst this is a fairly rare occurrence, it can happen from the result of intermediate scatterlist processing such as scatterwalk_ffwd() in the crypto layer. Whilst that particular site could be fixed up, it still seems worthwhile to bring intel-iommu in line with other DMA API implementations in handling this robustly. To that end, fix the intel_map_sg() path to line up the mapping correctly (in units of MM pages rather than VT-d pages to match the aligned_nrpages() calculation) regardless of the offset, and use sg_phys() consistently for clarity. Reported-by: Harsh Jain Signed-off-by: Robin Murphy Reviewed by: Ashok Raj Tested by: Jacob Pan Cc: stable@vger.kernel.org Signed-off-by: Alex Williamson Signed-off-by: Sasha Levin commit 1eb1c013cc42e878cb3c87b1587c3bb303dea8d5 Author: Jaejoong Kim Date: Mon Dec 4 15:31:49 2017 +0900 ALSA: usb-audio: Add check return value for usb_string() [ Upstream commit 89b89d121ffcf8d9546633b98ded9d18b8f75891 ] snd_usb_copy_string_desc() returns zero if usb_string() fails. In case of failure, we need to check the snd_usb_copy_string_desc()'s return value and add an exception case Signed-off-by: Jaejoong Kim Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 1972b63db65eeead06e9e8e21bc7e26ef4807769 Author: Jaejoong Kim Date: Mon Dec 4 15:31:48 2017 +0900 ALSA: usb-audio: Fix out-of-bound error [ Upstream commit 251552a2b0d454badc8f486e6d79100970c744b0 ] The snd_usb_copy_string_desc() retrieves the usb string corresponding to the index number through the usb_string(). The problem is that the usb_string() returns the length of the string (>= 0) when successful, but it can also return a negative value about the error case or status of usb_control_msg(). If iClockSource is '0' as shown below, usb_string() will returns -EINVAL. This will result in '0' being inserted into buf[-22], and the following KASAN out-of-bound error message will be output. AudioControl Interface Descriptor: bLength 8 bDescriptorType 36 bDescriptorSubtype 10 (CLOCK_SOURCE) bClockID 1 bmAttributes 0x07 Internal programmable Clock (synced to SOF) bmControls 0x07 Clock Frequency Control (read/write) Clock Validity Control (read-only) bAssocTerminal 0 iClockSource 0 To fix it, check usb_string()'return value and bail out. ================================================================== BUG: KASAN: stack-out-of-bounds in parse_audio_unit+0x1327/0x1960 [snd_usb_audio] Write of size 1 at addr ffff88007e66735a by task systemd-udevd/18376 CPU: 0 PID: 18376 Comm: systemd-udevd Not tainted 4.13.0+ #3 Hardware name: LG Electronics 15N540-RFLGL/White Tip Mountain, BIOS 15N5 Call Trace: dump_stack+0x63/0x8d print_address_description+0x70/0x290 ? parse_audio_unit+0x1327/0x1960 [snd_usb_audio] kasan_report+0x265/0x350 __asan_store1+0x4a/0x50 parse_audio_unit+0x1327/0x1960 [snd_usb_audio] ? save_stack+0xb5/0xd0 ? save_stack_trace+0x1b/0x20 ? save_stack+0x46/0xd0 ? kasan_kmalloc+0xad/0xe0 ? kmem_cache_alloc_trace+0xff/0x230 ? snd_usb_create_mixer+0xb0/0x4b0 [snd_usb_audio] ? usb_audio_probe+0x4de/0xf40 [snd_usb_audio] ? usb_probe_interface+0x1f5/0x440 ? driver_probe_device+0x3ed/0x660 ? build_feature_ctl+0xb10/0xb10 [snd_usb_audio] ? save_stack_trace+0x1b/0x20 ? init_object+0x69/0xa0 ? snd_usb_find_csint_desc+0xa8/0xf0 [snd_usb_audio] snd_usb_mixer_controls+0x1dc/0x370 [snd_usb_audio] ? build_audio_procunit+0x890/0x890 [snd_usb_audio] ? snd_usb_create_mixer+0xb0/0x4b0 [snd_usb_audio] ? kmem_cache_alloc_trace+0xff/0x230 ? usb_ifnum_to_if+0xbd/0xf0 snd_usb_create_mixer+0x25b/0x4b0 [snd_usb_audio] ? snd_usb_create_stream+0x255/0x2c0 [snd_usb_audio] usb_audio_probe+0x4de/0xf40 [snd_usb_audio] ? snd_usb_autosuspend.part.7+0x30/0x30 [snd_usb_audio] ? __pm_runtime_idle+0x90/0x90 ? kernfs_activate+0xa6/0xc0 ? usb_match_one_id_intf+0xdc/0x130 ? __pm_runtime_set_status+0x2d4/0x450 usb_probe_interface+0x1f5/0x440 Cc: Signed-off-by: Jaejoong Kim Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 633255b43de114aefa72610922fa86bbda7cd951 Author: Takashi Iwai Date: Thu Nov 30 10:08:28 2017 +0100 ALSA: seq: Remove spurious WARN_ON() at timer check [ Upstream commit 43a3542870328601be02fcc9d27b09db467336ef ] The use of snd_BUG_ON() in ALSA sequencer timer may lead to a spurious WARN_ON() when a slave timer is deployed as its backend and a corresponding master timer stops meanwhile. The symptom was triggered by syzkaller spontaneously. Since the NULL timer is valid there, rip off snd_BUG_ON(). Reported-by: syzbot Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit 0bde6f9d1faf3d4aaf8346d8a326cf02e7ea1a3a Author: Robb Glasser Date: Tue Dec 5 09:16:55 2017 -0800 ALSA: pcm: prevent UAF in snd_pcm_info [ Upstream commit 362bca57f5d78220f8b5907b875961af9436e229 ] When the device descriptor is closed, the `substream->runtime` pointer is freed. But another thread may be in the ioctl handler, case SNDRV_CTL_IOCTL_PCM_INFO. This case calls snd_pcm_info_user() which calls snd_pcm_info() which accesses the now freed `substream->runtime`. Note: this fixes CVE-2017-0861 Signed-off-by: Robb Glasser Signed-off-by: Nick Desaulniers Cc: Signed-off-by: Takashi Iwai Signed-off-by: Sasha Levin commit fad18b48c112a932c80a77f38a5dc26bd0e13a2e Author: Rafael J. Wysocki Date: Fri Dec 1 15:08:12 2017 +0100 x86/PCI: Make broadcom_postcore_init() check acpi_disabled [ Upstream commit ddec3bdee05b06f1dda20ded003c3e10e4184cab ] acpi_os_get_root_pointer() may return a valid address even if acpi_disabled is set, but the host bridge information from the ACPI tables is not going to be used in that case and the Broadcom host bridge initialization should not be skipped then, So make broadcom_postcore_init() check acpi_disabled too to avoid this issue. Fixes: 6361d72b04d1 (x86/PCI: read Broadcom CNB20LE host bridge info before PCI scan) Reported-by: Dave Hansen Signed-off-by: Rafael J. Wysocki Signed-off-by: Thomas Gleixner Cc: Bjorn Helgaas Cc: Linux PCI Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/3186627.pxZj1QbYNg@aspire.rjw.lan Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 110bf7c92a9910ba877eb30006025cf50a143cbb Author: Eric Biggers Date: Fri Dec 8 15:13:27 2017 +0000 X.509: reject invalid BIT STRING for subjectPublicKey [ Upstream commit 0f30cbea005bd3077bd98cd29277d7fc2699c1da ] Adding a specially crafted X.509 certificate whose subjectPublicKey ASN.1 value is zero-length caused x509_extract_key_data() to set the public key size to SIZE_MAX, as it subtracted the nonexistent BIT STRING metadata byte. Then, x509_cert_parse() called kmemdup() with that bogus size, triggering the WARN_ON_ONCE() in kmalloc_slab(). This appears to be harmless, but it still must be fixed since WARNs are never supposed to be user-triggerable. Fix it by updating x509_cert_parse() to validate that the value has a BIT STRING metadata byte, and that the byte is 0 which indicates that the number of bits in the bitstring is a multiple of 8. It would be nice to handle the metadata byte in asn1_ber_decoder() instead. But that would be tricky because in the general case a BIT STRING could be implicitly tagged, and/or could legitimately have a length that is not a whole number of bytes. Here was the WARN (cleaned up slightly): WARNING: CPU: 1 PID: 202 at mm/slab_common.c:971 kmalloc_slab+0x5d/0x70 mm/slab_common.c:971 Modules linked in: CPU: 1 PID: 202 Comm: keyctl Tainted: G B 4.14.0-09238-g1d3b78bbc6e9 #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014 task: ffff880033014180 task.stack: ffff8800305c8000 Call Trace: __do_kmalloc mm/slab.c:3706 [inline] __kmalloc_track_caller+0x22/0x2e0 mm/slab.c:3726 kmemdup+0x17/0x40 mm/util.c:118 kmemdup include/linux/string.h:414 [inline] x509_cert_parse+0x2cb/0x620 crypto/asymmetric_keys/x509_cert_parser.c:106 x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174 asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388 key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850 SYSC_add_key security/keys/keyctl.c:122 [inline] SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62 entry_SYSCALL_64_fastpath+0x1f/0x96 Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder") Cc: # v3.7+ Signed-off-by: Eric Biggers Signed-off-by: David Howells Reviewed-by: James Morris Signed-off-by: Sasha Levin commit c75c77e330f1d619a1e505620fa3d4ebca9b4b8e Author: Eric Biggers Date: Fri Dec 8 15:13:27 2017 +0000 ASN.1: check for error from ASN1_OP_END__ACT actions [ Upstream commit 81a7be2cd69b412ab6aeacfe5ebf1bb6e5bce955 ] asn1_ber_decoder() was ignoring errors from actions associated with the opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT, ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT. In practice, this meant the pkcs7_note_signed_info() action (since that was the only user of those opcodes). Fix it by checking for the error, just like the decoder does for actions associated with the other opcodes. This bug allowed users to leak slab memory by repeatedly trying to add a specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY). In theory, this bug could also be used to bypass module signature verification, by providing a PKCS#7 message that is misparsed such that a signature's ->authattrs do not contain its ->msgdigest. But it doesn't seem practical in normal cases, due to restrictions on the format of the ->authattrs. Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder") Cc: # v3.7+ Signed-off-by: Eric Biggers Signed-off-by: David Howells Reviewed-by: James Morris Signed-off-by: Sasha Levin commit cb18143c205ee4e7310c22b29c3f6851e8b97029 Author: Huacai Chen Date: Tue Nov 21 14:23:39 2017 +0100 scsi: libsas: align sata_device's rps_resp on a cacheline [ Upstream commit c2e8fbf908afd81ad502b567a6639598f92c9b9d ] The rps_resp buffer in ata_device is a DMA target, but it isn't explicitly cacheline aligned. Due to this, adjacent fields can be overwritten with stale data from memory on non-coherent architectures. As a result, the kernel is sometimes unable to communicate with an SATA device behind a SAS expander. Fix this by ensuring that the rps_resp buffer is cacheline aligned. This issue is similar to that fixed by Commit 84bda12af31f93 ("libata: align ap->sector_buf") and Commit 4ee34ea3a12396f35b26 ("libata: Align ata_device's id on a cacheline"). Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen Signed-off-by: Christoph Hellwig Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 993e0f72a3ce3e0ac86e83d84515c7825b580812 Author: William Breathitt Gray Date: Wed Nov 8 10:23:11 2017 -0500 isa: Prevent NULL dereference in isa_bus driver callbacks [ Upstream commit 5a244727f428a06634f22bb890e78024ab0c89f3 ] The isa_driver structure for an isa_bus device is stored in the device platform_data member of the respective device structure. This platform_data member may be reset to NULL if isa_driver match callback for the device fails, indicating a device unsupported by the ISA driver. This patch fixes a possible NULL pointer dereference if one of the isa_driver callbacks to attempted for an unsupported device. This error should not occur in practice since ISA devices are typically manually configured and loaded by the users, but we may as well prevent this error from popping up for the 0day testers. Fixes: a5117ba7da37 ("[PATCH] Driver model: add ISA bus") Signed-off-by: William Breathitt Gray Cc: stable Acked-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit f0d372ffcd7b2ffe8ceaf3658b279ff54f5e8324 Author: Paul Meyer Date: Tue Nov 14 13:06:47 2017 -0700 hv: kvp: Avoid reading past allocated blocks from KVP file [ Upstream commit 297d6b6e56c2977fc504c61bbeeaa21296923f89 ] While reading in more than one block (50) of KVP records, the allocation goes per block, but the reads used the total number of allocated records (without resetting the pointer/stream). This causes the records buffer to overrun when the refresh reads more than one block over the previous capacity (e.g. reading more than 100 KVP records whereas the in-memory database was empty before). Fix this by reading the correct number of KVP records from file each time. Signed-off-by: Paul Meyer Signed-off-by: Long Li Cc: stable@vger.kernel.org Signed-off-by: K. Y. Srinivasan Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit fb30b3f000902ff8c4368c2d929c96ed777d15d7 Author: weiping zhang Date: Wed Nov 29 09:23:01 2017 +0800 virtio: release virtio index when fail to device_register [ Upstream commit e60ea67bb60459b95a50a156296041a13e0e380e ] index can be reused by other virtio device. Cc: stable@vger.kernel.org Signed-off-by: weiping zhang Reviewed-by: Cornelia Huck Signed-off-by: Michael S. Tsirkin Signed-off-by: Sasha Levin commit 8a7022f7625040e4295d89f72200cabd8195bdaf Author: Martin Kelly Date: Tue Dec 5 11:15:50 2017 -0800 can: usb_8dev: cancel urb on -EPIPE and -EPROTO [ Upstream commit 12147edc434c9e4c7c2f5fee2e5519b2e5ac34ce ] In mcba_usb, we have observed that when you unplug the device, the driver will endlessly resubmit failing URBs, which can cause CPU stalls. This issue is fixed in mcba_usb by catching the codes seen on device disconnect (-EPIPE and -EPROTO). This driver also resubmits in the case of -EPIPE and -EPROTO, so fix it in the same way. Signed-off-by: Martin Kelly Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit 8b65515d121586ebbfb1440768e35b207dde98b2 Author: Martin Kelly Date: Tue Dec 5 11:15:48 2017 -0800 can: esd_usb2: cancel urb on -EPIPE and -EPROTO [ Upstream commit 7a31ced3de06e9878e4f9c3abe8f87d9344d8144 ] In mcba_usb, we have observed that when you unplug the device, the driver will endlessly resubmit failing URBs, which can cause CPU stalls. This issue is fixed in mcba_usb by catching the codes seen on device disconnect (-EPIPE and -EPROTO). This driver also resubmits in the case of -EPIPE and -EPROTO, so fix it in the same way. Signed-off-by: Martin Kelly Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit 1faeca460ed7144544652dac82412e01a5ca9f29 Author: Martin Kelly Date: Tue Dec 5 11:15:47 2017 -0800 can: ems_usb: cancel urb on -EPIPE and -EPROTO [ Upstream commit bd352e1adfe0d02d3ea7c8e3fb19183dc317e679 ] In mcba_usb, we have observed that when you unplug the device, the driver will endlessly resubmit failing URBs, which can cause CPU stalls. This issue is fixed in mcba_usb by catching the codes seen on device disconnect (-EPIPE and -EPROTO). This driver also resubmits in the case of -EPIPE and -EPROTO, so fix it in the same way. Signed-off-by: Martin Kelly Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit bf4857475f657204552d790fd979577076d6e4ce Author: Martin Kelly Date: Tue Dec 5 11:15:49 2017 -0800 can: kvaser_usb: cancel urb on -EPIPE and -EPROTO [ Upstream commit 6aa8d5945502baf4687d80de59b7ac865e9e666b ] In mcba_usb, we have observed that when you unplug the device, the driver will endlessly resubmit failing URBs, which can cause CPU stalls. This issue is fixed in mcba_usb by catching the codes seen on device disconnect (-EPIPE and -EPROTO). This driver also resubmits in the case of -EPIPE and -EPROTO, so fix it in the same way. Signed-off-by: Martin Kelly Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit 16ec63f0f106391c728010e00263f7ee1f0d75c8 Author: Jimmy Assarsson Date: Tue Nov 21 08:22:28 2017 +0100 can: kvaser_usb: ratelimit errors if incomplete messages are received [ Upstream commit 8bd13bd522ff7dfa0eb371921aeb417155f7a3be ] Avoid flooding the kernel log with "Formate error", if incomplete message are received. Signed-off-by: Jimmy Assarsson Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit 5875c419e115db4f1cdd0fbc7dfac08b88cade9b Author: Jimmy Assarsson Date: Tue Nov 21 08:22:27 2017 +0100 can: kvaser_usb: Fix comparison bug in kvaser_usb_read_bulk_callback() [ Upstream commit e84f44eb5523401faeb9cc1c97895b68e3cfb78d ] The conditon in the while-loop becomes true when actual_length is less than 2 (MSG_HEADER_LEN). In best case we end up with a former, already dispatched msg, that got msg->len greater than actual_length. This will result in a "Format error" error printout. Problem seen when unplugging a Kvaser USB device connected to a vbox guest. warning: comparison between signed and unsigned integer expressions [-Wsign-compare] Signed-off-by: Jimmy Assarsson Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit db0081b8eab496df2b7c94ccce32e3ebb8aaefd8 Author: Jimmy Assarsson Date: Tue Nov 21 08:22:26 2017 +0100 can: kvaser_usb: free buf in error paths [ Upstream commit 435019b48033138581a6171093b181fc6b4d3d30 ] The allocated buffer was not freed if usb_submit_urb() failed. Signed-off-by: Jimmy Assarsson Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit 47a7b5ceecabda125be97c710831bb631ad7f8e9 Author: Oliver Stäbler Date: Mon Nov 20 14:45:15 2017 +0100 can: ti_hecc: Fix napi poll return value for repoll [ Upstream commit f6c23b174c3c96616514827407769cbcfc8005cf ] After commit d75b1ade567f ("net: less interrupt masking in NAPI") napi repoll is done only when work_done == budget. So we need to return budget if there are still packets to receive. Signed-off-by: Oliver Stäbler Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit be83ae76a679737f448472b44daeb03298dce61e Author: Colin Ian King Date: Tue Nov 7 16:45:04 2017 +0000 usb: host: fix incorrect updating of offset [ Upstream commit 1d5a31582ef046d3b233f0da1a68ae26519b2f0a ] The variable temp is incorrectly being updated, instead it should be offset otherwise the loop just reads the same capability value and loops forever. Thanks to Alan Stern for pointing out the correct fix to my original fix. Fix also cleans up clang warning: drivers/usb/host/ehci-dbg.c:840:4: warning: Value stored to 'temp' is never read Fixes: d49d43174400 ("USB: misc ehci updates") Cc: stable Signed-off-by: Colin Ian King Acked-by: Alan Stern Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 6951610430a4ac63c48d7a810bbad34d58722e02 Author: Oliver Neukum Date: Thu Nov 23 16:39:52 2017 +0100 USB: usbfs: Filter flags passed in from user space [ Upstream commit 446f666da9f019ce2ffd03800995487e79a91462 ] USBDEVFS_URB_ISO_ASAP must be accepted only for ISO endpoints. Improve sanity checking. Reported-by: Andrey Konovalov Signed-off-by: Oliver Neukum Cc: stable Acked-by: Alan Stern Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit d2eff96c49c9af5fe2dc33afec2ebb2e5b18f613 Author: Dan Carpenter Date: Fri Sep 22 23:43:25 2017 +0300 USB: devio: Prevent integer overflow in proc_do_submiturb() [ Upstream commit 57999d1107c1e60c2ca7088f2ac0f819e2f554b3 ] There used to be an integer overflow check in proc_do_submiturb() but we removed it. It turns out that it's still required. The uurb->buffer_length variable is a signed integer and it's controlled by the user. It can lead to an integer overflow when we do: num_sgs = DIV_ROUND_UP(uurb->buffer_length, USB_SG_SIZE); If we strip away the macro then that line looks like this: num_sgs = (uurb->buffer_length + USB_SG_SIZE - 1) / USB_SG_SIZE; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ It's the first addition which can overflow. Fixes: 1129d270cbfb ("USB: Increase usbfs transfer limit") Cc: stable Signed-off-by: Dan Carpenter Acked-by: Alan Stern Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit c91c25944050752582c79608f0dbe2a6a265b305 Author: Mateusz Berezecki Date: Wed Dec 21 09:19:14 2016 -0800 USB: Increase usbfs transfer limit [ Upstream commit 1129d270cbfbb7e2b1ec3dede4a13930bdd10e41 ] Promote a variable keeping track of USB transfer memory usage to a wider data type and allow for higher bandwidth transfers from a large number of USB devices connected to a single host. Signed-off-by: Mateusz Berezecki Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 3800bd5096064fbe93329c33ab8442852d50a61c Author: Yu Chen Date: Fri Dec 1 13:41:20 2017 +0200 usb: xhci: fix panic in xhci_free_virt_devices_depth_first [ Upstream commit 80e457699a8dbdd70f2d26911e46f538645c55fc ] Check vdev->real_port 0 to avoid panic [ 9.261347] [] xhci_free_virt_devices_depth_first+0x58/0x108 [ 9.261352] [] xhci_mem_cleanup+0x1bc/0x570 [ 9.261355] [] xhci_stop+0x140/0x1c8 [ 9.261365] [] usb_remove_hcd+0xfc/0x1d0 [ 9.261369] [] xhci_plat_remove+0x6c/0xa8 [ 9.261377] [] platform_drv_remove+0x2c/0x70 [ 9.261384] [] __device_release_driver+0x80/0x108 [ 9.261387] [] device_release_driver+0x2c/0x40 [ 9.261392] [] bus_remove_device+0xe0/0x120 [ 9.261396] [] device_del+0x114/0x210 [ 9.261399] [] platform_device_del+0x30/0xa0 [ 9.261403] [] dwc3_otg_work+0x204/0x488 [ 9.261407] [] event_work+0x304/0x5b8 [ 9.261414] [] process_one_work+0x148/0x490 [ 9.261417] [] worker_thread+0x50/0x4a0 [ 9.261421] [] kthread+0xe8/0x100 [ 9.261427] [] ret_from_fork+0x10/0x50 The problem can occur if xhci_plat_remove() is called shortly after xhci_plat_probe(). While xhci_free_virt_devices_depth_first been called before the device has been setup and get real_port initialized. The problem occurred on Hikey960 and was reproduced by Guenter Roeck on Kevin with chromeos-4.4. Fixes: ee8665e28e8d ("xhci: free xhci virtual devices with leaf nodes first") Cc: Guenter Roeck Cc: # v4.10+ Reviewed-by: Guenter Roeck Tested-by: Guenter Roeck Signed-off-by: Fan Ning Signed-off-by: Li Rui Signed-off-by: yangdi Signed-off-by: Yu Chen Signed-off-by: Mathias Nyman Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 8e7b166320edf88a778c2e108aa7faed81af0a40 Author: Mike Looijmans Date: Thu Nov 9 13:16:46 2017 +0100 usb: hub: Cycle HUB power when initialization fails [ Upstream commit 973593a960ddac0f14f0d8877d2d0abe0afda795 ] Sometimes the USB device gets confused about the state of the initialization and the connection fails. In particular, the device thinks that it's already set up and running while the host thinks the device still needs to be configured. To work around this issue, power-cycle the hub's output to issue a sort of "reset" to the device. This makes the device restart its state machine and then the initialization succeeds. This fixes problems where the kernel reports a list of errors like this: usb 1-1.3: device not accepting address 19, error -71 The end result is a non-functioning device. After this patch, the sequence becomes like this: usb 1-1.3: new high-speed USB device number 18 using ci_hdrc usb 1-1.3: device not accepting address 18, error -71 usb 1-1.3: new high-speed USB device number 19 using ci_hdrc usb 1-1.3: device not accepting address 19, error -71 usb 1-1-port3: attempt power cycle usb 1-1.3: new high-speed USB device number 21 using ci_hdrc usb-storage 1-1.3:1.2: USB Mass Storage device detected Signed-off-by: Mike Looijmans Acked-by: Alan Stern Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 9dc021c202539c38b9f321806b0d2fc0b7f0e14b Author: Rui Sousa Date: Mon Feb 13 10:01:25 2017 +0800 net: fec: fix multicast filtering hardware setup [ Upstream commit 01f8902bcf3ff124d0aeb88a774180ebcec20ace ] Fix hardware setup of multicast address hash: - Never clear the hardware hash (to avoid packet loss) - Construct the hash register values in software and then write once to hardware Signed-off-by: Rui Sousa Signed-off-by: Fugang Duan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 8bae4d83e608649a8432b0234755b518de6f3bf2 Author: Jan Kara Date: Wed Feb 8 14:30:53 2017 -0800 mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers [ Upstream commit 0911d0041c22922228ca52a977d7b0b0159fee4b ] Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY from ->page_mkwrite is completely unhandled by the mm code and results in locking and writeably mapping the page which definitely is not what the caller wanted. Fix Lustre and block_page_mkwrite_ret() used by other filesystems (notably GFS2) to return VM_FAULT_NOPAGE instead which results in bailing out from the fault code, the CPU then retries the access, and we fault again effectively doing what the handler wanted. Link: http://lkml.kernel.org/r/20170203150729.15863-1-jack@suse.cz Signed-off-by: Jan Kara Reported-by: Al Viro Reviewed-by: Jinshan Xiong Cc: Matthew Wilcox Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit dc09971d7092130626f4ba86d7b55d3c3cb6faf9 Author: Jason Baron Date: Tue Jan 24 21:49:41 2017 -0500 tcp: correct memory barrier usage in tcp_check_space() [ Upstream commit 56d806222ace4c3aeae516cd7a855340fb2839d8 ] sock_reset_flag() maps to __clear_bit() not the atomic version clear_bit(). Thus, we need smp_mb(), smp_mb__after_atomic() is not sufficient. Fixes: 3c7151275c0c ("tcp: add memory barriers to write space paths") Cc: Eric Dumazet Cc: Oleg Nesterov Signed-off-by: Jason Baron Acked-by: Eric Dumazet Reported-by: Oleg Nesterov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 940fc0fc3010ed9aaec006a8c85fce00eb7d7be3 Author: Iago Abal Date: Wed Jan 11 14:00:21 2017 +0100 dmaengine: pl330: fix double lock [ Upstream commit 91539eb1fda2d530d3b268eef542c5414e54bf1a ] The static bug finder EBA (http://www.iagoabal.eu/eba/) reported the following double-lock bug: Double lock: 1. spin_lock_irqsave(pch->lock, flags) at pl330_free_chan_resources:2236; 2. call to function `pl330_release_channel' immediately after; 3. call to function `dma_pl330_rqcb' in line 1753; 4. spin_lock_irqsave(pch->lock, flags) at dma_pl330_rqcb:1505. I have fixed it as suggested by Marek Szyprowski. First, I have replaced `pch->lock' with `pl330->lock' in functions `pl330_alloc_chan_resources' and `pl330_free_chan_resources'. This avoids the double-lock by acquiring a different lock than `dma_pl330_rqcb'. NOTE that, as a result, `pl330_free_chan_resources' executes `list_splice_tail_init' on `pch->work_list' under lock `pl330->lock', whereas in the rest of the code `pch->work_list' is protected by `pch->lock'. I don't know if this may cause race conditions. Similarly `pch->cyclic' is written by `pl330_alloc_chan_resources' under `pl330->lock' but read by `pl330_tx_submit' under `pch->lock'. Second, I have removed locking from `pl330_request_channel' and `pl330_release_channel' functions. Function `pl330_request_channel' is only called from `pl330_alloc_chan_resources', so the lock is already held. Function `pl330_release_channel' is called from `pl330_free_chan_resources', which already holds the lock, and from `pl330_del'. Function `pl330_del' is called in an error path of `pl330_probe' and at the end of `pl330_remove', but I assume that there cannot be concurrent accesses to the protected data at those points. Signed-off-by: Iago Abal Reviewed-by: Marek Szyprowski Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit b40bf0f48cb0b58ba64b924545c8cbfce5d48e25 Author: Parthasarathy Bhuvaragan Date: Tue Jan 24 13:00:48 2017 +0100 tipc: fix cleanup at module unload [ Upstream commit 35e22e49a5d6a741ebe7f2dd280b2052c3003ef7 ] In tipc_server_stop(), we iterate over the connections with limiting factor as server's idr_in_use. We ignore the fact that this variable is decremented in tipc_close_conn(), leading to premature exit. In this commit, we iterate until the we have no connections left. Acked-by: Ying Xue Acked-by: Jon Maloy Tested-by: John Thompson Signed-off-by: Parthasarathy Bhuvaragan Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f37ddb16f1c3b362b65bf60b6ae47a61aef90bdf Author: Colin Ian King Date: Fri Jan 20 13:01:57 2017 +0000 net: sctp: fix array overrun read on sctp_timer_tbl [ Upstream commit 0e73fc9a56f22f2eec4d2b2910c649f7af67b74d ] The comparison on the timeout can lead to an array overrun read on sctp_timer_tbl because of an off-by-one error. Fix this by using < instead of <= and also compare to the array size rather than SCTP_EVENT_TIMEOUT_MAX. Fixes CoverityScan CID#1397639 ("Out-of-bounds read") Signed-off-by: Colin Ian King Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 6db42b569047e3edcbf6d4333db46c8890249407 Author: Trond Myklebust Date: Fri Jan 13 13:31:32 2017 -0500 NFSv4: Fix client recovery when server reboots multiple times [ Upstream commit c6180a6237174f481dc856ed6e890d8196b6f0fb ] If the server reboots multiple times, the client should rely on the server to tell it that it cannot reclaim state as per section 9.6.3.4 in RFC7530 and section 8.4.2.1 in RFC5661. Currently, the client is being to conservative, and is assuming that if the server reboots while state recovery is in progress, then it must ignore state that was not recovered before the reboot. Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit e82ffdf0df73c325c8c324ab680447e5b45ea131 Author: Benjamin Coddington Date: Thu Jan 5 10:20:16 2017 -0500 nfs: Don't take a reference on fl->fl_file for LOCK operation [ Upstream commit 4b09ec4b14a168bf2c687e1f598140c3c11e9222 ] I have reports of a crash that look like __fput() was called twice for a NFSv4.0 file. It seems possible that the state manager could try to reclaim a lock and take a reference on the fl->fl_file at the same time the file is being released if, during the close(), a signal interrupts the wait for outstanding IO while removing locks which then skips the removal of that lock. Since 83bfff23e9ed ("nfs4: have do_vfs_lock take an inode pointer") has removed the need to traverse fl->fl_file->f_inode in nfs4_lock_done(), taking that reference is no longer necessary. Signed-off-by: Benjamin Coddington Reviewed-by: Jeff Layton Signed-off-by: Trond Myklebust Signed-off-by: Sasha Levin commit f3fca5512aea1b3c8519c0c378cdcb41e9f0658a Author: Vlad Tsyrklevich Date: Mon Jan 9 20:57:48 2017 +0700 net/appletalk: Fix kernel memory disclosure [ Upstream commit ce7e40c432ba84da104438f6799d460a4cad41bc ] ipddp_route structs contain alignment padding so kernel heap memory is leaked when they are copied to user space in ipddp_ioctl(SIOCFINDIPDDPRT). Change kmalloc() to kzalloc() to clear that memory. Signed-off-by: Vlad Tsyrklevich Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c1dae9f5f4612a33a253e07f8fcd80cd39949f09 Author: David Forster Date: Fri Jan 6 10:27:59 2017 +0000 vti6: fix device register to report IFLA_INFO_KIND [ Upstream commit 93e246f783e6bd1bc64fdfbfe68b18161f69b28e ] vti6 interface is registered before the rtnl_link_ops block is attached. As a result the resulting RTM_NEWLINK is missing IFLA_INFO_KIND. Re-order attachment of rtnl_link_ops block to fix. Signed-off-by: Dave Forster Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 63745efc72c36d450e46e2d710516de8a061054d Author: Peter Ujfalusi Date: Tue Jan 3 13:22:34 2017 +0200 ARM: OMAP1: DMA: Correct the number of logical channels [ Upstream commit 657279778af54f35e54b07b6687918f254a2992c ] OMAP1510, OMAP5910 and OMAP310 have only 9 logical channels. OMAP1610, OMAP5912, OMAP1710, OMAP730, and OMAP850 have 16 logical channels available. The wired 17 for the lch_count must have been used to cover the 16 + 1 dedicated LCD channel, in reality we can only use 9 or 16 channels. The d->chan_count is not used by the omap-dma stack, so we can skip the setup. chan_count was configured to the number of logical channels and not the actual number of physical channels anyways. Signed-off-by: Peter Ujfalusi Acked-by: Aaro Koskinen Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin commit cbbdc8f66360272ad19f6094baaa61e457d16276 Author: Florian Fainelli Date: Tue Jan 3 16:34:49 2017 -0800 net: systemport: Pad packet before inserting TSB [ Upstream commit 38e5a85562a6cd911fc26d951d576551a688574c ] Inserting the TSB means adding an extra 8 bytes in front the of packet that is going to be used as metadata information by the TDMA engine, but stripped off, so it does not really help with the packet padding. For some odd packet sizes that fall below the 60 bytes payload (e.g: ARP) we can end-up padding them after the TSB insertion, thus making them 64 bytes, but with the TDMA stripping off the first 8 bytes, they could still be smaller than 64 bytes which is required to ingress the switch. Fix this by swapping the padding and TSB insertion, guaranteeing that the packets have the right sizes. Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1e3183d48594ffbbf8c3393f18f1069176afee1c Author: Florian Fainelli Date: Tue Jan 3 16:34:48 2017 -0800 net: systemport: Utilize skb_put_padto() [ Upstream commit bb7da333d0a9f3bddc08f84187b7579a3f68fd24 ] Since we need to pad our packets, utilize skb_put_padto() which increases skb->len by how much we need to pad, allowing us to eliminate the test on skb->len right below. Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f553cd8397d90eea5580f6ea6cf4bd09ef24852f Author: Masami Hiramatsu Date: Tue Sep 19 19:01:40 2017 +0900 kprobes/x86: Disable preemption in ftrace-based jprobes [ Upstream commit 5bb4fc2d8641219732eb2bb654206775a4219aca ] Disable preemption in ftrace-based jprobe handlers as described in Documentation/kprobes.txt: "Probe handlers are run with preemption disabled." This will fix jprobes behavior when CONFIG_PREEMPT=y. Signed-off-by: Masami Hiramatsu Cc: Alexei Starovoitov Cc: Alexei Starovoitov Cc: Ananth N Mavinakayanahalli Cc: Linus Torvalds Cc: Paul E . McKenney Cc: Peter Zijlstra Cc: Steven Rostedt Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/150581530024.32348.9863783558598926771.stgit@devbox Signed-off-by: Ingo Molnar Signed-off-by: Sasha Levin commit 870743f8d9ad65f9bbde7832a5bfd5518ca35a67 Author: Thomas Richter Date: Wed Sep 13 10:12:09 2017 +0200 perf test attr: Fix ignored test case result [ Upstream commit 22905582f6dd4bbd0c370fe5732c607452010c04 ] Command perf test -v 16 (Setup struct perf_event_attr test) always reports success even if the test case fails. It works correctly if you also specify -F (for don't fork). root@s35lp76 perf]# ./perf test -v 16 15: Setup struct perf_event_attr : --- start --- running './tests/attr/test-record-no-delay' [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.002 MB /tmp/tmp4E1h7R/perf.data (1 samples) ] expected task=0, got 1 expected precise_ip=0, got 3 expected wakeup_events=1, got 0 FAILED './tests/attr/test-record-no-delay' - match failure test child finished with 0 ---- end ---- Setup struct perf_event_attr: Ok The reason for the wrong error reporting is the return value of the system() library call. It is called in run_dir() file tests/attr.c and returns the exit status, in above case 0xff00. This value is given as parameter to the exit() function which can only handle values 0-0xff. The child process terminates with exit value of 0 and the parent does not detect any error. This patch corrects the error reporting and prints the correct test result. Signed-off-by: Thomas-Mich Richter Acked-by: Jiri Olsa Cc: Heiko Carstens Cc: Hendrik Brueckner Cc: Martin Schwidefsky Cc: Thomas-Mich Richter LPU-Reference: 20170913081209.39570-2-tmricht@linux.vnet.ibm.com Link: http://lkml.kernel.org/n/tip-rdube6rfcjsr1nzue72c7lqn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin commit bc26e3e8be2ad05a6f6b508f0c3a862b44d85c86 Author: Jibin Xu Date: Sun Sep 10 20:11:42 2017 -0700 sysrq : fix Show Regs call trace on ARM [ Upstream commit b00bebbc301c8e1f74f230dc82282e56b7e7a6db ] When kernel configuration SMP,PREEMPT and DEBUG_PREEMPT are enabled, echo 1 >/proc/sys/kernel/sysrq echo p >/proc/sysrq-trigger kernel will print call trace as below: sysrq: SysRq : Show Regs BUG: using __this_cpu_read() in preemptible [00000000] code: sh/435 caller is __this_cpu_preempt_check+0x18/0x20 Call trace: [] dump_backtrace+0x0/0x1d0 [] show_stack+0x24/0x30 [] dump_stack+0x90/0xb0 [] check_preemption_disabled+0x100/0x108 [] __this_cpu_preempt_check+0x18/0x20 [] sysrq_handle_showregs+0x1c/0x40 [] __handle_sysrq+0x12c/0x1a0 [] write_sysrq_trigger+0x60/0x70 [] proc_reg_write+0x90/0xd0 [] __vfs_write+0x48/0x90 [] vfs_write+0xa4/0x190 [] SyS_write+0x54/0xb0 [] el0_svc_naked+0x24/0x28 This can be seen on a common board like an r-pi3. This happens because when echo p >/proc/sysrq-trigger, get_irq_regs() is called outside of IRQ context, if preemption is enabled in this situation,kernel will print the call trace. Since many prior discussions on the mailing lists have made it clear that get_irq_regs either just returns NULL or stale data when used outside of IRQ context,we simply avoid calling it outside of IRQ context. Signed-off-by: Jibin Xu Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 668e191ced4dbf483fedb3f98920b014b1ecb446 Author: Gustavo A. R. Silva Date: Mon Oct 16 12:40:29 2017 -0500 EDAC, sb_edac: Fix missing break in switch [ Upstream commit a8e9b186f153a44690ad0363a56716e7077ad28c ] Add missing break statement in order to prevent the code from falling through. Signed-off-by: Gustavo A. R. Silva Cc: Qiuxu Zhuo Cc: linux-edac Link: http://lkml.kernel.org/r/20171016174029.GA19757@embeddedor.com Signed-off-by: Borislav Petkov Signed-off-by: Sasha Levin commit d65d97837a83dc61d60917b9eda9f5fe4d7dd7e4 Author: Hiromitsu Yamasaki Date: Thu Nov 2 10:32:36 2017 +0100 spi: sh-msiof: Fix DMA transfer size check [ Upstream commit 36735783fdb599c94b9c86824583df367c65900b ] DMA supports 32-bit words only, even if BITLEN1 of SITMDR2 register is 16bit. Fixes: b0d0ce8b6b91 ("spi: sh-msiof: Add DMA support") Signed-off-by: Hiromitsu Yamasaki Signed-off-by: Simon Horman Acked-by: Geert Uytterhoeven Acked-by: Dirk Behme Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit bbf799a53f80fc15e256c30cb10bee1d5f80cfb3 Author: Lukas Wunner Date: Sat Oct 28 11:35:49 2017 +0200 serial: 8250_fintek: Fix rs485 disablement on invalid ioctl() [ Upstream commit 3236a965486ba0c6043cf2c7b51943d8b382ae29 ] This driver's ->rs485_config callback checks if SER_RS485_RTS_ON_SEND and SER_RS485_RTS_AFTER_SEND have the same value. If they do, it means the user has passed in invalid data with the TIOCSRS485 ioctl() since RTS must have a different polarity when sending and when not sending. In this case, rs485 mode is not enabled (the RS485_URA bit is not set in the RS485 Enable Register) and this is supposed to be signaled back to the user by clearing the SER_RS485_ENABLED bit in struct serial_rs485 ... except a missing tilde character is preventing that from happening. Fixes: 28e3fb6c4dce ("serial: Add support for Fintek F81216A LPC to 4 UART") Cc: Ricardo Ribalda Delgado Cc: "Ji-Ze Hong (Peter Hong)" Signed-off-by: Lukas Wunner Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 282dc3f75ee440d2c74427546ccc76237b843e1f Author: Christian Borntraeger Date: Mon Oct 30 14:38:58 2017 +0100 s390/pci: do not require AIS facility [ Upstream commit 48070c73058be6de9c0d754d441ed7092dfc8f12 ] As of today QEMU does not provide the AIS facility to its guest. This prevents Linux guests from using PCI devices as the ais facility is checked during init. As this is just a performance optimization, we can move the ais check into the code where we need it (calling the SIC instruction). This is used at initialization and on interrupt. Both places do not require any serialization, so we can simply skip the instruction. Since we will now get all interrupts, we can also avoid the 2nd scan. As we can have multiple interrupts in parallel we might trigger spurious irqs more often for the non-AIS case but the core code can handle that. Signed-off-by: Christian Borntraeger Reviewed-by: Pierre Morel Reviewed-by: Halil Pasic Acked-by: Sebastian Ott Signed-off-by: Heiko Carstens Signed-off-by: Sasha Levin commit 1203cd50416157f5dba4742a7c50b2a6c9efd344 Author: Boshi Wang Date: Fri Oct 20 16:01:03 2017 +0800 ima: fix hash algorithm initialization [ Upstream commit ebe7c0a7be92bbd34c6ff5b55810546a0ee05bee ] The hash_setup function always sets the hash_setup_done flag, even when the hash algorithm is invalid. This prevents the default hash algorithm defined as CONFIG_IMA_DEFAULT_HASH from being used. This patch sets hash_setup_done flag only for valid hash algorithms. Fixes: e7a2ad7eb6f4 "ima: enable support for larger default filedata hash algorithms" Signed-off-by: Boshi Wang Signed-off-by: Mimi Zohar Signed-off-by: Sasha Levin commit 553111085c72988d535381aa679dfddee39c24b0 Author: Matt Wilson Date: Mon Nov 13 11:31:31 2017 -0800 serial: 8250_pci: Add Amazon PCI serial device ID [ Upstream commit 3bfd1300abfe3adb18e84a89d97a0e82a22124bb ] This device will be used in future Amazon EC2 instances as the primary serial port (i.e., data sent to this port will be available via the GetConsoleOuput [1] EC2 API). [1] http://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_GetConsoleOutput.html Cc: stable Signed-off-by: Matt Wilson Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit eba5e6ff42d9bd8c33b301d4bd6ebd1892bd1b80 Author: Kai-Heng Feng Date: Tue Nov 14 01:31:15 2017 -0500 usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub [ Upstream commit e43a12f1793ae1fe006e26fe9327a8840a92233c ] KY-688 USB 3.1 Type-C Hub internally uses a Genesys Logic hub to connect to Realtek r8153. Similar to commit ("7496cfe5431f2 usb: quirks: Add no-lpm quirk for Moshi USB to Ethernet Adapter"), no-lpm can make r8153 ethernet work. Signed-off-by: Kai-Heng Feng Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit b55902dd3ce515ee45177c2926358fe0fb077d27 Author: Hans de Goede Date: Tue Nov 14 19:27:22 2017 +0100 uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices [ Upstream commit 7fee72d5e8f1e7b8d8212e28291b1a0243ecf2f1 ] We've been adding this as a quirk on a per device basis hoping that newer disk enclosures would do better, but that has not happened, so simply apply this quirk to all Seagate devices. Signed-off-by: Hans de Goede Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 37993da309a63f734578df21e2541094ad39d7f2 Author: Rui Hua Date: Fri Nov 24 15:14:26 2017 -0800 bcache: recover data from backing when data is clean [ Upstream commit e393aa2446150536929140739f09c6ecbcbea7f0 ] When we send a read request and hit the clean data in cache device, there is a situation called cache read race in bcache(see the commit in the tail of cache_look_up(), the following explaination just copy from there): The bucket we're reading from might be reused while our bio is in flight, and we could then end up reading the wrong data. We guard against this by checking (in bch_cache_read_endio()) if the pointer is stale again; if so, we treat it as an error (s->iop.error = -EINTR) and reread from the backing device (but we don't pass that error up anywhere) It should be noted that cache read race happened under normal circumstances, not the circumstance when SSD failed, it was counted and shown in /sys/fs/bcache/XXX/internal/cache_read_races. Without this patch, when we use writeback mode, we will never reread from the backing device when cache read race happened, until the whole cache device is clean, because the condition (s->recoverable && (dc && !atomic_read(&dc->has_dirty))) is false in cached_dev_read_error(). In this situation, the s->iop.error(= -EINTR) will be passed up, at last, user will receive -EINTR when it's bio end, this is not suitable, and wield to up-application. In this patch, we use s->read_dirty_data to judge whether the read request hit dirty data in cache device, it is safe to reread data from the backing device when the read request hit clean data. This can not only handle cache read race, but also recover data when failed read request from cache device. [edited by mlyle to fix up whitespace, commit log title, comment spelling] Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean") Cc: # 4.14 Signed-off-by: Hua Rui Reviewed-by: Michael Lyle Reviewed-by: Coly Li Signed-off-by: Michael Lyle Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 151eae0b40efbc839f0eadfb72032560fc44e2fb Author: Coly Li Date: Mon Oct 30 14:46:31 2017 -0700 bcache: only permit to recovery read error when cache device is clean [ Upstream commit d59b23795933678c9638fd20c942d2b4f3cd6185 ] When bcache does read I/Os, for example in writeback or writethrough mode, if a read request on cache device is failed, bcache will try to recovery the request by reading from cached device. If the data on cached device is not synced with cache device, then requester will get a stale data. For critical storage system like database, providing stale data from recovery may result an application level data corruption, which is unacceptible. With this patch, for a failed read request in writeback or writethrough mode, recovery a recoverable read request only happens when cache device is clean. That is to say, all data on cached device is up to update. For other cache modes in bcache, read request will never hit cached_dev_read_error(), they don't need this patch. Please note, because cache mode can be switched arbitrarily in run time, a writethrough mode might be switched from a writeback mode. Therefore checking dc->has_data in writethrough mode still makes sense. Changelog: V4: Fix parens error pointed by Michael Lyle. v3: By response from Kent Oversteet, he thinks recovering stale data is a bug to fix, and option to permit it is unnecessary. So this version the sysfs file is removed. v2: rename sysfs entry from allow_stale_data_on_failure to allow_stale_data_on_failure, and fix the confusing commit log. v1: initial patch posted. [small change to patch comment spelling by mlyle] Signed-off-by: Coly Li Signed-off-by: Michael Lyle Reported-by: Arne Wolf Reviewed-by: Michael Lyle Cc: Kent Overstreet Cc: Nix Cc: Kai Krakow Cc: Eric Wheeler Cc: Junhui Tang Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit d0dc056013e2601e645f6ac87decf279107199c9 Author: Tang Junhui Date: Wed Sep 6 14:25:53 2017 +0800 bcache: do not subtract sectors_to_gc for bypassed IO [ Upstream commit 69daf03adef5f7bc13e0ac86b4b8007df1767aab ] Since bypassed IOs use no bucket, so do not subtract sectors_to_gc to trigger gc thread. Signed-off-by: tang.junhui Acked-by: Coly Li Reviewed-by: Eric Wheeler Reviewed-by: Christoph Hellwig Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 7633901a23e975811e78ce4bc46e623dc79cbe3b Author: WANG Cong Date: Sat Jun 24 23:50:30 2017 -0700 tcp: reset sk_rx_dst in tcp_disconnect() [ Upstream commit d747a7a51b00984127a88113cdbbc26f91e9d815 ] We have to reset the sk->sk_rx_dst when we disconnect a TCP connection, because otherwise when we re-connect it this dst reference is simply overridden in tcp_finish_connect(). This fixes a dst leak which leads to a loopback dev refcnt leak. It is a long-standing bug, Kevin reported a very similar (if not same) bug before. Thanks to Andrei for providing such a reliable reproducer which greatly narrows down the problem. Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.") Reported-by: Andrei Vagin Reported-by: Kevin Xu Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5615a7bebb369bd9e09f2f1311f70eb8cdae4bb6 Author: WANG Cong Date: Wed Jun 21 14:34:58 2017 -0700 ipv6: avoid unregistering inet6_dev for loopback [ Upstream commit 60abc0be96e00ca71bac083215ac91ad2e575096 ] The per netns loopback_dev->ip6_ptr is unregistered and set to NULL when its mtu is set to smaller than IPV6_MIN_MTU, this leads to that we could set rt->rt6i_idev NULL after a rt6_uncached_list_flush_dev() and then crash after another call. In this case we should just bring its inet6_dev down, rather than unregistering it, at least prior to commit 176c39af29bc ("netns: fix addrconf_ifdown kernel panic") we always override the case for loopback. Thanks a lot to Andrey for finding a reliable reproducer. Fixes: 176c39af29bc ("netns: fix addrconf_ifdown kernel panic") Reported-by: Andrey Konovalov Cc: Andrey Konovalov Cc: Daniel Lezcano Cc: David Ahern Signed-off-by: Cong Wang Acked-by: David Ahern Tested-by: Andrey Konovalov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5ff32f710c40f59ff7021f9e66b76c708515cb48 Author: Russell King Date: Tue May 30 16:21:51 2017 +0100 net: phy: fix marvell phy status reading [ Upstream commit 898805e0cdf7fd860ec21bf661d3a0285a3defbd ] The Marvell driver incorrectly provides phydev->lp_advertising as the logical and of the link partner's advert and our advert. This is incorrect - this field is supposed to store the link parter's unmodified advertisment. This allows ethtool to report the correct link partner auto-negotiation status. Fixes: be937f1f89ca ("Marvell PHY m88e1111 driver fix") Signed-off-by: Russell King Reviewed-by: Andrew Lunn Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 62437482578a9c221c8162e4cb0c3fc7d0f8a605 Author: Kristina Martsenko Date: Wed May 3 16:37:46 2017 +0100 arm64: hw_breakpoint: fix watchpoint matching for tagged pointers [ Upstream commit 7dcd9dd8cebe9fa626af7e2358d03a37041a70fb ] When we take a watchpoint exception, the address that triggered the watchpoint is found in FAR_EL1. We compare it to the address of each configured watchpoint to see which one was hit. The configured watchpoint addresses are untagged, while the address in FAR_EL1 will have an address tag if the data access was done using a tagged address. The tag needs to be removed to compare the address to the watchpoints. Currently we don't remove it, and as a result can report the wrong watchpoint as being hit (specifically, always either the highest TTBR0 watchpoint or lowest TTBR1 watchpoint). This patch removes the tag. Fixes: d50240a5f6ce ("arm64: mm: permit use of tagged pointers at EL0") Cc: # 3.12.x- Acked-by: Mark Rutland Acked-by: Will Deacon Signed-off-by: Kristina Martsenko Signed-off-by: Catalin Marinas Signed-off-by: Sasha Levin commit c19aa530105b0d780ad72a78a7ef271037bcb774 Author: Eric Biggers Date: Thu Jun 8 14:48:40 2017 +0100 KEYS: fix dereferencing NULL payload with nonzero length [ Upstream commit 5649645d725c73df4302428ee4e02c869248b4c5 ] sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a NULL payload with nonzero length to be passed to the key type's ->preparse(), ->instantiate(), and/or ->update() methods. Various key types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did not handle this case, allowing an unprivileged user to trivially cause a NULL pointer dereference (kernel oops) if one of these key types was present. Fix it by doing the copy_from_user() when 'plen' is nonzero rather than when '_payload' is non-NULL, causing the syscall to fail with EFAULT as expected when an invalid buffer is specified. Cc: stable@vger.kernel.org # 2.6.10+ Signed-off-by: Eric Biggers Signed-off-by: David Howells Signed-off-by: James Morris Signed-off-by: Sasha Levin commit a5e71be0786d72497535db1d79678feb9ca7bda5 Author: Gabriel Krisman Bertazi Date: Wed Dec 28 16:42:00 2016 -0200 8250_pci: Fix potential use-after-free in error path [ Upstream commit c130b666a9a711f985a0a44b58699ebe14bb7245 ] Commit f209fa03fc9d ("serial: 8250_pci: Detach low-level driver during PCI error recovery") introduces a potential use-after-free in case the pciserial_init_ports call in serial8250_io_resume fails, which may happen if a memory allocation fails or if the .init quirk failed for whatever reason). If this happen, further pci_get_drvdata will return a pointer to freed memory. This patch reworks the PCI recovery resume hook to restore the old priv structure in this case, which should be ok, since the ports were already detached. Such error during recovery causes us to give up on the recovery. Fixes: f209fa03fc9d ("serial: 8250_pci: Detach low-level driver during PCI error recovery") Reported-by: Michal Suchanek Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Guilherme G. Piccoli Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit c87134f416a6c3b302bed97d38977c9845171d38 Author: Jason A. Donenfeld Date: Thu Mar 23 12:24:43 2017 +0100 padata: avoid race in reordering [ Upstream commit de5540d088fe97ad583cc7d396586437b32149a5 ] Under extremely heavy uses of padata, crashes occur, and with list debugging turned on, this happens instead: [87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33 __list_add+0xae/0x130 [87487.301868] list_add corruption. prev->next should be next (ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00). [87487.339011] [] dump_stack+0x68/0xa3 [87487.342198] [] ? console_unlock+0x281/0x6d0 [87487.345364] [] __warn+0xff/0x140 [87487.348513] [] warn_slowpath_fmt+0x4a/0x50 [87487.351659] [] __list_add+0xae/0x130 [87487.354772] [] ? _raw_spin_lock+0x64/0x70 [87487.357915] [] padata_reorder+0x1e6/0x420 [87487.361084] [] padata_do_serial+0xa5/0x120 padata_reorder calls list_add_tail with the list to which its adding locked, which seems correct: spin_lock(&squeue->serial.lock); list_add_tail(&padata->list, &squeue->serial.list); spin_unlock(&squeue->serial.lock); This therefore leaves only place where such inconsistency could occur: if padata->list is added at the same time on two different threads. This pdata pointer comes from the function call to padata_get_next(pd), which has in it the following block: next_queue = per_cpu_ptr(pd->pqueue, cpu); padata = NULL; reorder = &next_queue->reorder; if (!list_empty(&reorder->list)) { padata = list_entry(reorder->list.next, struct padata_priv, list); spin_lock(&reorder->lock); list_del_init(&padata->list); atomic_dec(&pd->reorder_objects); spin_unlock(&reorder->lock); pd->processed++; goto out; } out: return padata; I strongly suspect that the problem here is that two threads can race on reorder list. Even though the deletion is locked, call to list_entry is not locked, which means it's feasible that two threads pick up the same padata object and subsequently call list_add_tail on them at the same time. The fix is thus be hoist that lock outside of that block. Signed-off-by: Jason A. Donenfeld Acked-by: Steffen Klassert Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit b3282b3115cdd97609cbf3491995c5cc2bb57660 Author: David Hildenbrand Date: Thu Mar 23 18:24:19 2017 +0100 KVM: kvm_io_bus_unregister_dev() should never fail [ Upstream commit 90db10434b163e46da413d34db8d0e77404cc645 ] No caller currently checks the return value of kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on freeing their device. A stale reference will remain in the io_bus, getting at least used again, when the iobus gets teared down on kvm_destroy_vm() - leading to use after free errors. There is nothing the callers could do, except retrying over and over again. So let's simply remove the bus altogether, print an error and make sure no one can access this broken bus again (returning -ENOMEM on any attempt to access it). Fixes: e93f8a0f821e ("KVM: convert io_bus to SRCU") Cc: stable@vger.kernel.org # 3.4+ Reported-by: Dmitry Vyukov Reviewed-by: Cornelia Huck Signed-off-by: David Hildenbrand Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 4decce66d9e0542d75debac85998aa68b6753504 Author: Uwe Kleine-König Date: Sat Jul 2 17:28:10 2016 +0200 rtc: s35390a: improve irq handling [ Upstream commit 3bd32722c827d00eafe8e6d5b83e9f3148ea7c7e ] On some QNAP NAS devices the rtc can wake the machine. Several people noticed that once the machine was woken this way it fails to shut down. That's because the driver fails to acknowledge the interrupt and so it keeps active and restarts the machine immediatly after shutdown. See https://bugs.debian.org/794266 for a bug report. Doing this correctly requires to interpret the INT2 flag of the first read of the STATUS1 register because this bit is cleared by read. Note this is not maximally robust though because a pending irq isn't detected when the STATUS1 register was already read (and so INT2 is not set) but the irq was not disabled. But that is a hardware imposed problem that cannot easily be fixed by software. Signed-off-by: Uwe Kleine-König Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin commit a740217c448eb1b16b7bca95d55d212e6772bf44 Author: Uwe Kleine-König Date: Sat Jul 2 17:28:09 2016 +0200 rtc: s35390a: implement reset routine as suggested by the reference [ Upstream commit 8e6583f1b5d1f5f129b873f1428b7e414263d847 ] There were two deviations from the reference manual: you have to wait half a second when POC is active and you might have to repeat initialization when POC or BLD are still set after the sequence. Note however that as POC and BLD are cleared by read the driver might not be able to detect that a reset is necessary. I don't have a good idea how to fix this. Additionally report the value read from STATUS1 to the caller. This prepares the next patch. Signed-off-by: Uwe Kleine-König Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin commit 7b1ae699502eb7ef1ac9911567d199b23420c87d Author: Uwe Kleine-König Date: Sat Jul 2 17:28:08 2016 +0200 rtc: s35390a: fix reading out alarm [ Upstream commit f87e904ddd8f0ef120e46045b0addeb1cc88354e ] There are several issues fixed in this patch: - When alarm isn't enabled, set .enabled to zero instead of returning -EINVAL. - Ignore how IRQ1 is configured when determining if IRQ2 is on. - The three alarm registers have an enable flag which must be evaluated. - The chip always triggers when the seconds register gets 0. Note that the rtc framework however doesn't handle the result correctly because it doesn't check wday being initialized and so interprets an alarm being set for 10:00 AM in three days as 10:00 AM tomorrow (or today if that's not over yet). Signed-off-by: Uwe Kleine-König Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin commit 74f74c6040e0d6dd326793e7ad2976af33baba73 Author: Felix Fietkau Date: Thu Jan 19 12:28:22 2017 +0100 MIPS: Lantiq: Fix cascaded IRQ setup [ Upstream commit 6c356eda225e3ee134ed4176b9ae3a76f793f4dd ] With the IRQ stack changes integrated, the XRX200 devices started emitting a constant stream of kernel messages like this: [ 565.415310] Spurious IRQ: CAUSE=0x1100c300 This is caused by IP0 getting handled by plat_irq_dispatch() rather than its vectored interrupt handler, which is fixed by commit de856416e714 ("MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch"). Fix plat_irq_dispatch() to handle non-vectored IPI interrupts correctly by setting up IP2-6 as proper chained IRQ handlers and calling do_IRQ for all MIPS CPU interrupts. Signed-off-by: Felix Fietkau Acked-by: John Crispin Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15077/ [james.hogan@imgtec.com: tweaked commit message] Signed-off-by: James Hogan Signed-off-by: Sasha Levin commit b5fa0ad35bff76260919cd94812bcd4d0af5eadc Author: Peter Xu Date: Wed Mar 15 16:01:17 2017 +0800 KVM: x86: clear bus pointer when destroyed [ Upstream commit df630b8c1e851b5e265dc2ca9c87222e342c093b ] When releasing the bus, let's clear the bus pointers to mark it out. If any further device unregister happens on this bus, we know that we're done if we found the bus being released already. Signed-off-by: Peter Xu Signed-off-by: Radim Krčmář Signed-off-by: Sasha Levin commit 269e73f9e8bf03f6bbb139b16e53d29e7f45e372 Author: Ilya Dryomov Date: Tue Mar 21 13:44:28 2017 +0100 libceph: force GFP_NOIO for socket allocations [ Upstream commit 633ee407b9d15a75ac9740ba9d3338815e1fcb95 ] sock_alloc_inode() allocates socket+inode and socket_wq with GFP_KERNEL, which is not allowed on the writeback path: Workqueue: ceph-msgr con_work [libceph] ffff8810871cb018 0000000000000046 0000000000000000 ffff881085d40000 0000000000012b00 ffff881025cad428 ffff8810871cbfd8 0000000000012b00 ffff880102fc1000 ffff881085d40000 ffff8810871cb038 ffff8810871cb148 Call Trace: [] schedule+0x29/0x70 [] schedule_timeout+0x1bd/0x200 [] ? ttwu_do_wakeup+0x2c/0x120 [] ? ttwu_do_activate.constprop.135+0x66/0x70 [] wait_for_completion+0xbf/0x180 [] ? try_to_wake_up+0x390/0x390 [] flush_work+0x165/0x250 [] ? worker_detach_from_pool+0xd0/0xd0 [] xlog_cil_force_lsn+0x81/0x200 [xfs] [] ? __slab_free+0xee/0x234 [] _xfs_log_force_lsn+0x4d/0x2c0 [xfs] [] ? lookup_page_cgroup_used+0xe/0x30 [] ? xfs_reclaim_inode+0xa3/0x330 [xfs] [] xfs_log_force_lsn+0x3f/0xf0 [xfs] [] ? xfs_reclaim_inode+0xa3/0x330 [xfs] [] xfs_iunpin_wait+0xc6/0x1a0 [xfs] [] ? wake_atomic_t_function+0x40/0x40 [] xfs_reclaim_inode+0xa3/0x330 [xfs] [] xfs_reclaim_inodes_ag+0x257/0x3d0 [xfs] [] xfs_reclaim_inodes_nr+0x33/0x40 [xfs] [] xfs_fs_free_cached_objects+0x15/0x20 [xfs] [] super_cache_scan+0x178/0x180 [] shrink_slab_node+0x14e/0x340 [] ? mem_cgroup_iter+0x16b/0x450 [] shrink_slab+0x100/0x140 [] do_try_to_free_pages+0x335/0x490 [] try_to_free_pages+0xb9/0x1f0 [] ? __alloc_pages_direct_compact+0x69/0x1be [] __alloc_pages_nodemask+0x69a/0xb40 [] alloc_pages_current+0x9e/0x110 [] new_slab+0x2c5/0x390 [] __slab_alloc+0x33b/0x459 [] ? sock_alloc_inode+0x2d/0xd0 [] ? inet_sendmsg+0x71/0xc0 [] ? sock_alloc_inode+0x2d/0xd0 [] kmem_cache_alloc+0x1a2/0x1b0 [] sock_alloc_inode+0x2d/0xd0 [] alloc_inode+0x26/0xa0 [] new_inode_pseudo+0x1a/0x70 [] sock_alloc+0x1e/0x80 [] __sock_create+0x95/0x220 [] sock_create_kern+0x24/0x30 [] con_work+0xef9/0x2050 [libceph] [] ? rbd_img_request_submit+0x4c/0x60 [rbd] [] process_one_work+0x159/0x4f0 [] worker_thread+0x11b/0x530 [] ? create_worker+0x1d0/0x1d0 [] kthread+0xc9/0xe0 [] ? flush_kthread_worker+0x90/0x90 [] ret_from_fork+0x58/0x90 [] ? flush_kthread_worker+0x90/0x90 Use memalloc_noio_{save,restore}() to temporarily force GFP_NOIO here. Cc: stable@vger.kernel.org # 3.10+, needs backporting Link: http://tracker.ceph.com/issues/19309 Reported-by: Sergey Jerusalimov Signed-off-by: Ilya Dryomov Reviewed-by: Jeff Layton Signed-off-by: Sasha Levin commit 21af28db5ef7c8bed49f9dfaaa9ad410032b635e Author: Dave Martin Date: Mon Mar 27 15:10:57 2017 +0100 metag/ptrace: Reject partial NT_METAG_RPIPE writes [ Upstream commit 7195ee3120d878259e8d94a5d9f808116f34d5ea ] It's not clear what behaviour is sensible when doing partial write of NT_METAG_RPIPE, so just don't bother. This patch assumes that userspace will never rely on a partial SETREGSET in this case, since it's not clear what should happen anyway. Signed-off-by: Dave Martin Acked-by: James Hogan Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 6779b4582eca8357ee731e95185c0f9f20abf71b Author: Dave Martin Date: Mon Mar 27 15:10:56 2017 +0100 metag/ptrace: Provide default TXSTATUS for short NT_PRSTATUS [ Upstream commit 5fe81fe98123ce41265c65e95d34418d30d005d1 ] Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill TXSTATUS, a well-defined default value is used, based on the task's current value. Suggested-by: James Hogan Signed-off-by: Dave Martin Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit e77d1491b2a00f9edfbcd53c232ad714afca87a2 Author: Dave Martin Date: Mon Mar 27 15:10:55 2017 +0100 metag/ptrace: Preserve previous registers for short regset write [ Upstream commit a78ce80d2c9178351b34d78fec805140c29c193e ] Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. Signed-off-by: Dave Martin Acked-by: James Hogan Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 3da09f082fb3fa5789b4c6bb5abd90c392cf7d26 Author: Dave Martin Date: Mon Mar 27 15:10:59 2017 +0100 sparc/ptrace: Preserve previous registers for short regset write [ Upstream commit d3805c546b275c8cc7d40f759d029ae92c7175f2 ] Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. Signed-off-by: Dave Martin Acked-by: David S. Miller Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 0585334e230df59103088ce48f018a19fdd8f664 Author: Dave Martin Date: Mon Mar 27 15:10:58 2017 +0100 mips/ptrace: Preserve previous registers for short regset write [ Upstream commit d614fd58a2834cfe4efa472c33c8f3ce2338b09b ] Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET to fill all the registers, the thread's old registers are preserved. Signed-off-by: Dave Martin Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 48dd0fe903bfb00cc7460eb57ac33f24451dcb35 Author: Dave Martin Date: Mon Mar 27 15:10:53 2017 +0100 c6x/ptrace: Remove useless PTRACE_SETREGSET implementation [ Upstream commit fb411b837b587a32046dc4f369acb93a10b1def8 ] gpr_set won't work correctly and can never have been tested, and the correct behaviour is not clear due to the endianness-dependent task layout. So, just remove it. The core code will now return -EOPNOTSUPPORT when trying to set NT_PRSTATUS on this architecture until/unless a correct implementation is supplied. Signed-off-by: Dave Martin Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 171a957989c4f398adb41078dbfff73b0821e383 Author: Andy Whitcroft Date: Thu Mar 23 07:45:44 2017 +0000 xfrm_user: validate XFRM_MSG_NEWAE incoming ESN size harder [ Upstream commit f843ee6dd019bcece3e74e76ad9df0155655d0df ] Kees Cook has pointed out that xfrm_replay_state_esn_len() is subject to wrapping issues. To ensure we are correctly ensuring that the two ESN structures are the same size compare both the overall size as reported by xfrm_replay_state_esn_len() and the internal length are the same. CVE-2017-7184 Signed-off-by: Andy Whitcroft Acked-by: Steffen Klassert Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 438db92d7f2792e3bad17be70e6edf0f44a081f0 Author: Andy Whitcroft Date: Wed Mar 22 07:29:31 2017 +0000 xfrm_user: validate XFRM_MSG_NEWAE XFRMA_REPLAY_ESN_VAL replay_window [ Upstream commit 677e806da4d916052585301785d847c3b3e6186a ] When a new xfrm state is created during an XFRM_MSG_NEWSA call we validate the user supplied replay_esn to ensure that the size is valid and to ensure that the replay_window size is within the allocated buffer. However later it is possible to update this replay_esn via a XFRM_MSG_NEWAE call. There we again validate the size of the supplied buffer matches the existing state and if so inject the contents. We do not at this point check that the replay_window is within the allocated memory. This leads to out-of-bounds reads and writes triggered by netlink packets. This leads to memory corruption and the potential for priviledge escalation. We already attempt to validate the incoming replay information in xfrm_new_ae() via xfrm_replay_verify_len(). This confirms that the user is not trying to change the size of the replay state buffer which includes the replay_esn. It however does not check the replay_window remains within that buffer. Add validation of the contained replay_window. CVE-2017-7184 Signed-off-by: Andy Whitcroft Acked-by: Steffen Klassert Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 5847973a96fdce329f1f589b481d6373d3e692e5 Author: Florian Westphal Date: Wed Feb 8 11:52:29 2017 +0100 xfrm: policy: init locks early [ Upstream commit c282222a45cb9503cbfbebfdb60491f06ae84b49 ] Dmitry reports following splat: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 13059 Comm: syz-executor1 Not tainted 4.10.0-rc7-next-20170207 #1 [..] spin_lock_bh include/linux/spinlock.h:304 [inline] xfrm_policy_flush+0x32/0x470 net/xfrm/xfrm_policy.c:963 xfrm_policy_fini+0xbf/0x560 net/xfrm/xfrm_policy.c:3041 xfrm_net_init+0x79f/0x9e0 net/xfrm/xfrm_policy.c:3091 ops_init+0x10a/0x530 net/core/net_namespace.c:115 setup_net+0x2ed/0x690 net/core/net_namespace.c:291 copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396 create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106 unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205 SYSC_unshare kernel/fork.c:2281 [inline] Problem is that when we get error during xfrm_net_init we will call xfrm_policy_fini which will acquire xfrm_policy_lock before it was initialized. Just move it around so locks get set up first. Reported-by: Dmitry Vyukov Fixes: 283bc9f35bbbcb0e9 ("xfrm: Namespacify xfrm state/policy locks") Signed-off-by: Florian Westphal Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit 2832213a871a607243a257dd5b5015be7c2d9d57 Author: Jiri Slaby Date: Thu Dec 15 14:31:01 2016 +0100 crypto: algif_hash - avoid zero-sized array [ Upstream commit 6207119444595d287b1e9e83a2066c17209698f3 ] With this reproducer: struct sockaddr_alg alg = { .salg_family = 0x26, .salg_type = "hash", .salg_feat = 0xf, .salg_mask = 0x5, .salg_name = "digest_null", }; int sock, sock2; sock = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(sock, (struct sockaddr *)&alg, sizeof(alg)); sock2 = accept(sock, NULL, NULL); setsockopt(sock, SOL_ALG, ALG_SET_KEY, "\x9b\xca", 2); accept(sock2, NULL, NULL); ==== 8< ======== 8< ======== 8< ======== 8< ==== one can immediatelly see an UBSAN warning: UBSAN: Undefined behaviour in crypto/algif_hash.c:187:7 variable length array bound value 0 <= 0 CPU: 0 PID: 15949 Comm: syz-executor Tainted: G E 4.4.30-0-default #1 ... Call Trace: ... [] ? __ubsan_handle_vla_bound_not_positive+0x13d/0x188 [] ? __ubsan_handle_out_of_bounds+0x1bc/0x1bc [] ? hash_accept+0x5bd/0x7d0 [algif_hash] [] ? hash_accept_nokey+0x3f/0x51 [algif_hash] [] ? hash_accept_parent_nokey+0x4a0/0x4a0 [algif_hash] [] ? SyS_accept+0x2b/0x40 It is a correct warning, as hash state is propagated to accept as zero, but creating a zero-length variable array is not allowed in C. Fix this as proposed by Herbert -- do "?: 1" on that site. No sizeof or similar happens in the code there, so we just allocate one byte even though we do not use the array. Signed-off-by: Jiri Slaby Cc: Herbert Xu Cc: "David S. Miller" (maintainer:CRYPTO API) Reported-by: Sasha Levin Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 4ebd212334f5fb10d3ea458f48c2520bd437a943 Author: Takashi Iwai Date: Wed Jan 11 17:09:50 2017 +0100 fbcon: Fix vc attr at deinit [ Upstream commit 8aac7f34369726d1a158788ae8aff3002d5eb528 ] fbcon can deal with vc_hi_font_mask (the upper 256 chars) and adjust the vc attrs dynamically when vc_hi_font_mask is changed at fbcon_init(). When the vc_hi_font_mask is set, it remaps the attrs in the existing console buffer with one bit shift up (for 9 bits), while it remaps with one bit shift down (for 8 bits) when the value is cleared. It works fine as long as the font gets updated after fbcon was initialized. However, we hit a bizarre problem when the console is switched to another fb driver (typically from vesafb or efifb to drmfb). At switching to the new fb driver, we temporarily rebind the console to the dummy console, then rebind to the new driver. During the switching, we leave the modified attrs as is. Thus, the new fbcon takes over the old buffer as if it were to contain 8 bits chars (although the attrs are still shifted for 9 bits), and effectively this results in the yellow color texts instead of the original white color, as found in the bugzilla entry below. An easy fix for this is to re-adjust the attrs before leaving the fbcon at con_deinit callback. Since the code to adjust the attrs is already present in the current fbcon code, in this patch, we simply factor out the relevant code, and call it from fbcon_deinit(). Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000619 Signed-off-by: Takashi Iwai Signed-off-by: Bartlomiej Zolnierkiewicz Signed-off-by: Sasha Levin commit 7d6b48119b43368dd3f886c84d16361eefb84f10 Author: Sumit Semwal Date: Sat Mar 25 21:48:19 2017 +0530 serial: 8250_pci: Detach low-level driver during PCI error recovery [ Upstream commit f209fa03fc9d131b3108c2e4936181eabab87416 ] During a PCI error recovery, like the ones provoked by EEH in the ppc64 platform, all IO to the device must be blocked while the recovery is completed. Current 8250_pci implementation only suspends the port instead of detaching it, which doesn't prevent incoming accesses like TIOCMGET and TIOCMSET calls from reaching the device. Those end up racing with the EEH recovery, crashing it. Similar races were also observed when opening the device and when shutting it down during recovery. This patch implements a more robust IO blockage for the 8250_pci recovery by unregistering the port at the beginning of the procedure and re-adding it afterwards. Since the port is detached from the uart layer, we can be sure that no request will make through to the device during recovery. This is similar to the solution used by the JSM serial driver. I thank Peter Hurley for valuable input on this one over one year ago. Signed-off-by: Gabriel Krisman Bertazi Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 76286fd0010e64fb434a2f3fdfdc051f3fdd8668 Author: Sumit Semwal Date: Sat Mar 25 21:48:16 2017 +0530 [media] uvcvideo: uvc_scan_fallback() for webcams with broken chain [ Upstream commit e950267ab802c8558f1100eafd4087fd039ad634 ] Some devices have invalid baSourceID references, causing uvc_scan_chain() to fail, but if we just take the entities we can find and put them together in the most sensible chain we can think of, turns out they do work anyway. Note: This heuristic assumes there is a single chain. At the time of writing, devices known to have such a broken chain are - Acer Integrated Camera (5986:055a) - Realtek rtl157a7 (0bda:57a7) Signed-off-by: Henrik Ingo Signed-off-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit fe51f26e30ab35a5a50df6d4163818c452e9683d Author: Sumit Semwal Date: Sat Mar 25 21:48:14 2017 +0530 block: allow WRITE_SAME commands with the SG_IO ioctl [ Upstream commit 25cdb64510644f3e854d502d69c73f21c6df88a9 ] The WRITE_SAME commands are not present in the blk_default_cmd_filter write_ok list, and thus are failed with -EPERM when the SG_IO ioctl() is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users). [ sg_io() -> blk_fill_sghdr_rq() > blk_verify_command() -> -EPERM ] The problem can be reproduced with the sg_write_same command # sg_write_same --num 1 --xferlen 512 /dev/sda # # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_same --num 1 --xferlen 512 /dev/sda' Write same: pass through os error: Operation not permitted # For comparison, the WRITE_VERIFY command does not observe this problem, since it is in that list: # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda' # So, this patch adds the WRITE_SAME commands to the list, in order for the SG_IO ioctl to finish successfully: # capsh --drop=cap_sys_rawio -- -c \ 'sg_write_same --num 1 --xferlen 512 /dev/sda' # That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices (qemu "-device scsi-block" [1], libvirt "" [2]), which employs the SG_IO ioctl() and runs as an unprivileged user (libvirt-qemu). In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls, which are translated to write-same calls in the guest kernel, and then into SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest: [...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current] [...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated [...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00 [...] blk_update_request: I/O error, dev sda, sector 17096824 Links: [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52 [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device') Signed-off-by: Mauricio Faria de Oliveira Signed-off-by: Brahadambal Srinivasan Reported-by: Manjunatha H R Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 51cbcace401cc742bc16b83eebfa697811098d1e Author: Sumit Semwal Date: Sat Mar 25 21:48:12 2017 +0530 PCI: Do any VF BAR updates before enabling the BARs [ Upstream commit f40ec3c748c6912f6266c56a7f7992de61b255ed ] Previously we enabled VFs and enable their memory space before calling pcibios_sriov_enable(). But pcibios_sriov_enable() may update the VF BARs: for example, on PPC PowerNV we may change them to manage the association of VFs to PEs. Because 64-bit BARs cannot be updated atomically, it's unsafe to update them while they're enabled. The half-updated state may conflict with other devices in the system. Call pcibios_sriov_enable() before enabling the VFs so any BAR updates happen while the VF BARs are disabled. [bhelgaas: changelog] Tested-by: Carol Soto Signed-off-by: Gavin Shan Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin commit eeaf3677ba86c9eb95799358faa9c53d0354bf38 Author: Sumit Semwal Date: Sat Mar 25 21:48:10 2017 +0530 PCI: Update BARs using property bits appropriate for type [ Upstream commit 45d004f4afefdd8d79916ee6d97a9ecd94bb1ffe ] The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed to be read-only, but we do save them in res->flags and include them when updating the BAR. Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits 2-3 of I/O addresses. Use PCI_ROM_ADDRESS_MASK for ROM BARs. This means we'll only check the top 21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if the update was successful. Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin commit e7c37e14662329dbb0a0a117682374fc20beb987 Author: Sumit Semwal Date: Sat Mar 25 21:48:09 2017 +0530 PCI: Don't update VF BARs while VF memory space is enabled [ Upstream commit 546ba9f8f22f71b0202b6ba8967be5cc6dae4e21 ] If we update a VF BAR while it's enabled, there are two potential problems: 1) Any driver that's using the VF has a cached BAR value that is stale after the update, and 2) We can't update 64-bit BARs atomically, so the intermediate state (new lower dword with old upper dword) may conflict with another device, and an access by a driver unrelated to the VF may cause a bus error. Warn about attempts to update VF BARs while they are enabled. This is a programming error, so use dev_WARN() to get a backtrace. Signed-off-by: Bjorn Helgaas Reviewed-by: Gavin Shan Signed-off-by: Sasha Levin commit d388267762adcee75048174a3d2861bbe5dff7c7 Author: Sumit Semwal Date: Sat Mar 25 21:48:08 2017 +0530 PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE [ Upstream commit 7a6d312b50e63f598f5b5914c4fd21878ac2b595 ] Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE. PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if we're reading or writing a BAR register value, that's what we should use. IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags. Signed-off-by: Bjorn Helgaas Reviewed-by: Gavin Shan Signed-off-by: Sasha Levin commit d7b502b513fbd247fe5b8fd64858750ea5fe0028 Author: Sumit Semwal Date: Sat Mar 25 21:48:07 2017 +0530 PCI: Add comments about ROM BAR updating [ Upstream commit 0b457dde3cf8b7c76a60f8e960f21bbd4abdc416 ] pci_update_resource() updates a hardware BAR so its address matches the kernel's struct resource UNLESS it's a disabled ROM BAR. We only update those when we enable the ROM. It's not obvious from the code why ROM BARs should be handled specially. Apparently there are Matrox devices with defective ROM BARs that read as zero when disabled. That means that if pci_enable_rom() reads the disabled BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and writes it back, it would enable the ROM at address zero. Add comments and references to explain why we can't make the code look more rational. The code changes are from 755528c860b0 ("Ignore disabled ROM resources at setup") and 8085ce084c0f ("[PATCH] Fix PCI ROM mapping"). Link: https://lkml.org/lkml/2005/8/30/138 Signed-off-by: Bjorn Helgaas Reviewed-by: Gavin Shan Signed-off-by: Sasha Levin commit c6268b0a8d1d1169f876245967ebb8ffffdcb640 Author: Sumit Semwal Date: Sat Mar 25 21:48:06 2017 +0530 PCI: Remove pci_resource_bar() and pci_iov_resource_bar() [ Upstream commit 286c2378aaccc7343ebf17ec6cd86567659caf70 ] pci_std_update_resource() only deals with standard BARs, so we don't have to worry about the complications of VF BARs in an SR-IOV capability. Compute the BAR address inline and remove pci_resource_bar(). That makes pci_iov_resource_bar() unused, so remove that as well. Signed-off-by: Bjorn Helgaas Reviewed-by: Gavin Shan Signed-off-by: Sasha Levin commit e71f476f1f27be45803526c3cd44ad87a86e4a0d Author: Sumit Semwal Date: Sat Mar 25 21:48:05 2017 +0530 PCI: Separate VF BAR updates from standard BAR updates [ Upstream commit 6ffa2489c51da77564a0881a73765ea2169f955d ] Previously pci_update_resource() used the same code path for updating standard BARs and VF BARs in SR-IOV capabilities. Split the VF BAR update into a new pci_iov_update_resource() internal interface, which makes it simpler to compute the BAR address (we can get rid of pci_resource_bar() and pci_iov_resource_bar()). This patch: - Renames pci_update_resource() to pci_std_update_resource(), - Adds pci_iov_update_resource(), - Makes pci_update_resource() a wrapper that calls the appropriate one, No functional change intended. Signed-off-by: Bjorn Helgaas Reviewed-by: Gavin Shan Signed-off-by: Sasha Levin commit b6c8db12dc590c49e838bf9b84146d05438d30a5 Author: Sumit Semwal Date: Sat Mar 25 21:48:03 2017 +0530 igb: add i211 to i210 PHY workaround [ Upstream commit 5bc8c230e2a993b49244f9457499f17283da9ec7 ] i210 and i211 share the same PHY but have different PCI IDs. Don't forget i211 for any i210 workarounds. Signed-off-by: Todd Fujinaka Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin commit 215aa8c6f01795449458831a878d17ad57b65ac8 Author: Sumit Semwal Date: Sat Mar 25 21:48:02 2017 +0530 igb: Workaround for igb i210 firmware issue [ Upstream commit 4e684f59d760a2c7c716bb60190783546e2d08a1 ] Sometimes firmware may not properly initialize I347AT4_PAGE_SELECT causing the probe of an igb i210 NIC to fail. This patch adds an addition zeroing of this register during igb_get_phy_id to workaround this issue. Thanks for Jochen Henneberg for the idea and original patch. Signed-off-by: Chris J Arges Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher Signed-off-by: Sasha Levin commit 901a56cd88a89a81a923e5cd588e21fa5b780bbd Author: Darrick J. Wong Date: Wed Jan 25 20:24:57 2017 -0800 xfs: clear _XBF_PAGES from buffers when readahead page [ Upstream commit 2aa6ba7b5ad3189cc27f14540aa2f57f0ed8df4b ] If we try to allocate memory pages to back an xfs_buf that we're trying to read, it's possible that we'll be so short on memory that the page allocation fails. For a blocking read we'll just wait, but for readahead we simply dump all the pages we've collected so far. Unfortunately, after dumping the pages we neglect to clear the _XBF_PAGES state, which means that the subsequent call to xfs_buf_free thinks that b_pages still points to pages we own. It then double-frees the b_pages pages. This results in screaming about negative page refcounts from the memory manager, which xfs oughtn't be triggering. To reproduce this case, mount a filesystem where the size of the inodes far outweighs the availalble memory (a ~500M inode filesystem on a VM with 300MB memory did the trick here) and run bulkstat in parallel with other memory eating processes to put a huge load on the system. The "check summary" phase of xfs_scrub also works for this purpose. Signed-off-by: Darrick J. Wong Reviewed-by: Eric Sandeen Signed-off-by: Sasha Levin commit 5e36a29d1d6eaf438fa835f1875cec72156976e1 Author: Darrick J. Wong Date: Mon Dec 5 12:38:38 2016 +1100 xfs: don't allow di_size with high bit set [ Upstream commit ef388e2054feedaeb05399ed654bdb06f385d294 ] The on-disk field di_size is used to set i_size, which is a signed integer of loff_t. If the high bit of di_size is set, we'll end up with a negative i_size, which will cause all sorts of problems. Since the VFS won't let us create a file with such length, we should catch them here in the verifier too. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner Signed-off-by: Dave Chinner Signed-off-by: Sasha Levin commit da6d71ba4926027d1ccee806286584061ba99edd Author: Ilya Dryomov Date: Wed Mar 1 17:33:27 2017 +0100 libceph: don't set weight to IN when OSD is destroyed [ Upstream commit b581a5854eee4b7851dedb0f8c2ceb54fb902c06 ] Since ceph.git commit 4e28f9e63644 ("osd/OSDMap: clear osd_info, osd_xinfo on osd deletion"), weight is set to IN when OSD is deleted. This changes the result of applying an incremental for clients, not just OSDs. Because CRUSH computations are obviously affected, pre-4e28f9e63644 servers disagree with post-4e28f9e63644 clients on object placement, resulting in misdirected requests. Mirrors ceph.git commit a6009d1039a55e2c77f431662b3d6cc5a8e8e63f. Fixes: 930c53286977 ("libceph: apply new_state before new_up_client on incrementals") Link: http://tracker.ceph.com/issues/19122 Signed-off-by: Ilya Dryomov Reviewed-by: Sage Weil Signed-off-by: Sasha Levin commit 8a40cc46a2303c4978143022b23dd7ec6c1aec71 Author: Tomasz Majchrzak Date: Thu Jul 28 10:28:25 2016 +0200 raid10: increment write counter after bio is split [ Upstream commit 9b622e2bbcf049c82e2550d35fb54ac205965f50 ] md pending write counter must be incremented after bio is split, otherwise it gets decremented too many times in end bio callback and becomes negative. Signed-off-by: Tomasz Majchrzak Reviewed-by: Artur Paszkiewicz Signed-off-by: Shaohua Li Signed-off-by: Sasha Levin commit e9056085f2970aa35db93705f44e01401334516f Author: Song Hongyan Date: Wed Feb 22 17:17:38 2017 +0800 iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3 [ Upstream commit 3bec247474469f769af41e8c80d3a100dd97dd76 ] In function _hid_sensor_power_state(), when hid_sensor_read_poll_value() is called, sensor's all properties will be updated by the value from sensor hardware/firmware. In some implementation, sensor hardware/firmware will do a power cycle during S3. In this case, after resume, once hid_sensor_read_poll_value() is called, sensor's all properties which are kept by driver during S3 will be changed to default value. But instead, if a set feature function is called first, sensor hardware/firmware will be recovered to the last status. So change the sensor_hub_set_feature() calling order to behind of set feature function to avoid sensor properties lose. Signed-off-by: Song Hongyan Acked-by: Srinivas Pandruvada Cc: Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin commit 36e0d023b278834bed53b709691b9a5e9d3f3a58 Author: Michael Engl Date: Tue Oct 3 13:57:00 2017 +0100 iio: adc: ti_am335x_adc: fix fifo overrun recovery [ Upstream commit e83bb3e6f3efa21f4a9d883a25d0ecd9dfb431e1 ] The tiadc_irq_h(int irq, void *private) function is handling FIFO overruns by clearing flags, disabling and enabling the ADC to recover. If the ADC is running in continuous mode a FIFO overrun happens regularly. If the disabling of the ADC happens concurrently with a new conversion. It might happen that the enabling of the ADC is ignored by the hardware. This stops the ADC permanently. No more interrupts are triggered. According to the AM335x Reference Manual (SPRUH73H October 2011 - Revised April 2013 - Chapter 12.4 and 12.5) it is necessary to check the ADC FSM bits in REG_ADCFSM before enabling the ADC again. Because the disabling of the ADC is done right after the current conversion has been finished. To trigger this bug it is necessary to run the ADC in continuous mode. The ADC values of all channels need to be read in an endless loop. The bug appears within the first 6 hours (~5.4 million handled FIFO overruns). The user space application will hang on reading new values from the character device. Fixes: ca9a563805f7a ("iio: ti_am335x_adc: Add continuous sampling support") Signed-off-by: Michael Engl Cc: Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin commit 8725f73f3577b973997c33af5afaab0fc803b382 Author: Eric Dumazet Date: Wed Mar 22 08:10:21 2017 -0700 tcp: initialize icsk_ack.lrcvtime at session start time [ Upstream commit 15bb7745e94a665caf42bfaabf0ce062845b533b ] icsk_ack.lrcvtime has a 0 value at socket creation time. tcpi_last_data_recv can have bogus value if no payload is ever received. This patch initializes icsk_ack.lrcvtime for active sessions in tcp_finish_connect(), and for passive sessions in tcp_create_openreq_child() Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit a96f1d23bd8e22a86dea69847f3f4b97f0d8dd56 Author: Eric Dumazet Date: Tue Mar 21 19:22:28 2017 -0700 ipv4: provide stronger user input validation in nl_fib_input() [ Upstream commit c64c0b3cac4c5b8cb093727d2c19743ea3965c0b ] Alexander reported a KMSAN splat caused by reads of uninitialized field (tb_id_in) from user provided struct fib_result_nl It turns out nl_fib_input() sanity tests on user input is a bit wrong : User can pretend nlh->nlmsg_len is big enough, but provide at sendmsg() time a too small buffer. Reported-by: Alexander Potapenko Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit bc2ae9441bd27851611b06c0cf85bd1badbfb5e2 Author: Maor Gottlieb Date: Tue Mar 21 15:59:17 2017 +0200 net/mlx5: Increase number of max QPs in default profile [ Upstream commit 5f40b4ed975c26016cf41953b7510fe90718e21c ] With ConnectX-4 sharing SRQs from the same space as QPs, we hit a limit preventing some applications to allocate needed QPs amount. Double the size to 256K. Fixes: e126ba97dba9e ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Maor Gottlieb Signed-off-by: Saeed Mahameed Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 47e516803adab641f9f63294b6320bdca4cc2b24 Author: Andrey Ulanov Date: Tue Mar 14 20:16:42 2017 -0700 net: unix: properly re-increment inflight counter of GC discarded candidates [ Upstream commit 7df9c24625b9981779afb8fcdbe2bb4765e61147 ] Dmitry has reported that a BUG_ON() condition in unix_notinflight() may be triggered by a simple code that forwards unix socket in an SCM_RIGHTS message. That is caused by incorrect unix socket GC implementation in unix_gc(). The GC first collects list of candidates, then (a) decrements their "children's" inflight counter, (b) checks which inflight counters are now 0, and then (c) increments all inflight counters back. (a) and (c) are done by calling scan_children() with inc_inflight or dec_inflight as the second argument. Commit 6209344f5a37 ("net: unix: fix inflight counting bug in garbage collector") changed scan_children() such that it no longer considers sockets that do not have UNIX_GC_CANDIDATE flag. It also added a block of code that that unsets this flag _before_ invoking scan_children(, dec_iflight, ). This may lead to incorrect inflight counters for some sockets. This change fixes this bug by changing order of operations: UNIX_GC_CANDIDATE is now unset only after all inflight counters are restored to the original state. kernel BUG at net/unix/garbage.c:149! RIP: 0010:[] [] unix_notinflight+0x3b4/0x490 net/unix/garbage.c:149 Call Trace: [] unix_detach_fds.isra.19+0xff/0x170 net/unix/af_unix.c:1487 [] unix_destruct_scm+0xf9/0x210 net/unix/af_unix.c:1496 [] skb_release_head_state+0x101/0x200 net/core/skbuff.c:655 [] skb_release_all+0x1a/0x60 net/core/skbuff.c:668 [] __kfree_skb+0x1a/0x30 net/core/skbuff.c:684 [] kfree_skb+0x184/0x570 net/core/skbuff.c:705 [] unix_release_sock+0x5b5/0xbd0 net/unix/af_unix.c:559 [] unix_release+0x49/0x90 net/unix/af_unix.c:836 [] sock_release+0x92/0x1f0 net/socket.c:570 [] sock_close+0x1b/0x20 net/socket.c:1017 [] __fput+0x34e/0x910 fs/file_table.c:208 [] ____fput+0x1a/0x20 fs/file_table.c:244 [] task_work_run+0x1a0/0x280 kernel/task_work.c:116 [< inline >] exit_task_work include/linux/task_work.h:21 [] do_exit+0x183a/0x2640 kernel/exit.c:828 [] do_group_exit+0x14e/0x420 kernel/exit.c:931 [] get_signal+0x663/0x1880 kernel/signal.c:2307 [] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807 [] exit_to_usermode_loop+0x1ea/0x2d0 arch/x86/entry/common.c:156 [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190 [] syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259 [] entry_SYSCALL_64_fastpath+0xc4/0xc6 Link: https://lkml.org/lkml/2017/3/6/252 Signed-off-by: Andrey Ulanov Reported-by: Dmitry Vyukov Fixes: 6209344 ("net: unix: fix inflight counting bug in garbage collector") Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 2c472af8144771a3ab4bb0e846aad83bf1bc2d16 Author: Eric Dumazet Date: Wed Mar 15 13:21:28 2017 -0700 net: properly release sk_frag.page [ Upstream commit 22a0e18eac7a9e986fec76c60fa4a2926d1291e2 ] I mistakenly added the code to release sk->sk_frag in sk_common_release() instead of sk_destruct() TCP sockets using sk->sk_allocation == GFP_ATOMIC do no call sk_common_release() at close time, thus leaking one (order-3) page. iSCSI is using such sockets. Fixes: 5640f7685831 ("net: use a per task frag allocator") Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 02b24285db6ee1e2e0143425e68396242c937d86 Author: Florian Fainelli Date: Wed Mar 15 12:57:21 2017 -0700 net: bcmgenet: Do not suspend PHY if Wake-on-LAN is enabled [ Upstream commit 5371bbf4b295eea334ed453efa286afa2c3ccff3 ] Suspending the PHY would be putting it in a low power state where it may no longer allow us to do Wake-on-LAN. Fixes: cc013fb48898 ("net: bcmgenet: correctly suspend and resume PHY device") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit fe6eaf3dbffd088a6c2230e60e86084123c01b80 Author: Linus Torvalds Date: Thu Mar 2 12:17:22 2017 -0800 give up on gcc ilog2() constant optimizations [ Upstream commit 474c90156c8dcc2fa815e6716cc9394d7930cb9c ] gcc-7 has an "optimization" pass that completely screws up, and generates the code expansion for the (impossible) case of calling ilog2() with a zero constant, even when the code gcc compiles does not actually have a zero constant. And we try to generate a compile-time error for anybody doing ilog2() on a constant where that doesn't make sense (be it zero or negative). So now gcc7 will fail the build due to our sanity checking, because it created that constant-zero case that didn't actually exist in the source code. There's a whole long discussion on the kernel mailing about how to work around this gcc bug. The gcc people themselevs have discussed their "feature" in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72785 but it's all water under the bridge, because while it looked at one point like it would be solved by the time gcc7 was released, that was not to be. So now we have to deal with this compiler braindamage. And the only simple approach seems to be to just delete the code that tries to warn about bad uses of ilog2(). So now "ilog2()" will just return 0 not just for the value 1, but for any non-positive value too. It's not like I can recall anybody having ever actually tried to use this function on any invalid value, but maybe the sanity check just meant that such code never made it out in public. Reported-by: Laura Abbott Cc: John Stultz , Cc: Thomas Gleixner Cc: Ard Biesheuvel Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit eeedc18bf39e9c71730a0df1cfc65f4d9f29bc51 Author: Jason Gunthorpe Date: Wed Nov 25 14:05:30 2015 -0700 tpm_tis: Use devm_free_irq not free_irq [ Upstream commit 727f28b8ca24a581c7bd868326b8cea1058c720a ] The interrupt is always allocated with devm_request_irq so it must always be freed with devm_free_irq. Fixes: 448e9c55c12d ("tpm_tis: verify interrupt during init") Signed-off-by: Jason Gunthorpe Acked-by: Jarkko Sakkinen Tested-by: Jarkko Sakkinen Tested-by: Martin Wilck Signed-off-by: Jarkko Sakkinen Acked-by: Peter Huewe Signed-off-by: Sasha Levin commit 729cabd84d0fac2c5f7c25b6dcef5adf828a6e73 Author: Sebastian Ott Date: Fri Apr 15 09:41:35 2016 +0200 s390/pci: fix use after free in dma_init [ Upstream commit dba599091c191d209b1499511a524ad9657c0e5a ] After a failure during registration of the dma_table (because of the function being in error state) we free its memory but don't reset the associated pointer to zero. When we then receive a notification from firmware (about the function being in error state) we'll try to walk and free the dma_table again. Fix this by resetting the dma_table pointer. In addition to that make sure that we free the iommu_bitmap when appropriate. Signed-off-by: Sebastian Ott Reviewed-by: Gerald Schaefer Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin commit 7f9ab5fb771ec358f6239d0c6541ebd1b266e621 Author: Thomas Huth Date: Wed May 18 21:01:20 2016 +0200 KVM: PPC: Book3S PR: Fix illegal opcode emulation [ Upstream commit 708e75a3ee750dce1072134e630d66c4e6eaf63c ] If kvmppc_handle_exit_pr() calls kvmppc_emulate_instruction() to emulate one instruction (in the BOOK3S_INTERRUPT_H_EMUL_ASSIST case), it calls kvmppc_core_queue_program() afterwards if kvmppc_emulate_instruction() returned EMULATE_FAIL, so the guest gets an program interrupt for the illegal opcode. However, the kvmppc_emulate_instruction() also tried to inject a program exception for this already, so the program interrupt gets injected twice and the return address in srr0 gets destroyed. All other callers of kvmppc_emulate_instruction() are also injecting a program interrupt, and since the callers have the right knowledge about the srr1 flags that should be used, it is the function kvmppc_emulate_instruction() that should _not_ inject program interrupts, so remove the kvmppc_core_queue_program() here. This fixes the issue discovered by Laurent Vivier with kvm-unit-tests where the logs are filled with these messages when the test tries to execute an illegal instruction: Couldn't emulate instruction 0x00000000 (op 0 xop 0) kvmppc_handle_exit_pr: emulation at 700 failed (00000000) Signed-off-by: Thomas Huth Reviewed-by: Alexander Graf Tested-by: Laurent Vivier Signed-off-by: Paul Mackerras Signed-off-by: Sasha Levin commit 33097614b509ededfc4d416b062b0d253c4cfb0b Author: Vitaly Kuznetsov Date: Sat Apr 30 19:21:35 2016 -0700 Drivers: hv: balloon: don't crash when memory is added in non-sorted order [ Upstream commit 77c0c9735bc0ba5898e637a3a20d6bcb50e3f67d ] When we iterate through all HA regions in handle_pg_range() we have an assumption that all these regions are sorted in the list and the 'start_pfn >= has->end_pfn' check is enough to find the proper region. Unfortunately it's not the case with WS2016 where host can hot-add regions in a different order. We end up modifying the wrong HA region and crashing later on pages online. Modify the check to make sure we found the region we were searching for while iterating. Fix the same check in pfn_covered() as well. Signed-off-by: Vitaly Kuznetsov Signed-off-by: K. Y. Srinivasan Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 626143bc3937e1b00c3ef1fc31923939f1b943fc Author: Alex Hung Date: Fri May 27 15:47:06 2016 +0800 ACPI / video: skip evaluating _DOD when it does not exist [ Upstream commit e34fbbac669de0b7fb7803929d0477f35f6e2833 ] Some system supports hybrid graphics and its discrete VGA does not have any connectors and therefore has no _DOD method. Signed-off-by: Alex Hung Reviewed-by: Aaron Lu Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit e7dfb3682c515f61bc147a898098b3553533665e Author: Wang, Rui Y Date: Wed Jan 27 17:08:36 2016 +0800 crypto: mcryptd - Fix load failure [ Upstream commit ddef482420b1ba8ec45e6123a7e8d3f67b21e5e3 ] mcryptd_create_hash() fails by returning -EINVAL, causing any driver using mcryptd to fail to load. It is because it needs to set its statesize properly. Signed-off-by: Rui Wang Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 093a3219905814c9dc79495ca0b5ccc120f82deb Author: Wang, Rui Y Date: Sun Nov 29 22:45:34 2015 +0800 crypto: cryptd - Assign statesize properly [ Upstream commit 1a07834024dfca5c4bed5de8f8714306e0a11836 ] cryptd_create_hash() fails by returning -EINVAL. It is because after 8996eafdc ("crypto: ahash - ensure statesize is non-zero") all ahash drivers must have a non-zero statesize. This patch fixes the problem by properly assigning the statesize. Signed-off-by: Rui Wang Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 25f7d25dd917c78ab57cbf57017bfc74f6172e65 Author: Wang, Rui Y Date: Sun Nov 29 22:45:33 2015 +0800 crypto: ghash-clmulni - Fix load failure [ Upstream commit 3a020a723c65eb8ffa7c237faca26521a024e582 ] ghash_clmulni_intel fails to load on Linux 4.3+ with the following message: "modprobe: ERROR: could not insert 'ghash_clmulni_intel': Invalid argument" After 8996eafdc ("crypto: ahash - ensure statesize is non-zero") all ahash drivers are required to implement import()/export(), and must have a non- zero statesize. This patch has been tested with the algif_hash interface. The calculated digest values, after several rounds of import()s and export()s, match those calculated by tcrypt. Signed-off-by: Rui Wang Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 13c33d42d90d559bd3386ce9eeeda6806dee88f3 Author: Roman Mashak Date: Fri Feb 24 11:00:32 2017 -0500 net sched actions: decrement module reference count after table flush. [ Upstream commit edb9d1bff4bbe19b8ae0e71b1f38732591a9eeb2 ] When tc actions are loaded as a module and no actions have been installed, flushing them would result in actions removed from the memory, but modules reference count not being decremented, so that the modules would not be unloaded. Following is example with GACT action: % sudo modprobe act_gact % lsmod Module Size Used by act_gact 16384 0 % % sudo tc actions ls action gact % % sudo tc actions flush action gact % lsmod Module Size Used by act_gact 16384 1 % sudo tc actions flush action gact % lsmod Module Size Used by act_gact 16384 2 % sudo rmmod act_gact rmmod: ERROR: Module act_gact is in use .... After the fix: % lsmod Module Size Used by act_gact 16384 0 % % sudo tc actions add action pass index 1 % sudo tc actions add action pass index 2 % sudo tc actions add action pass index 3 % lsmod Module Size Used by act_gact 16384 3 % % sudo tc actions flush action gact % lsmod Module Size Used by act_gact 16384 0 % % sudo tc actions flush action gact % lsmod Module Size Used by act_gact 16384 0 % sudo rmmod act_gact % lsmod Module Size Used by % Fixes: f97017cdefef ("net-sched: Fix actions flushing") Signed-off-by: Roman Mashak Signed-off-by: Jamal Hadi Salim Acked-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c553b198be9cf82040bad3d760243e5a9d91630e Author: Hannes Frederic Sowa Date: Mon Mar 13 00:01:30 2017 +0100 dccp: fix memory leak during tear-down of unsuccessful connection request [ Upstream commit 72ef9c4125c7b257e3a714d62d778ab46583d6a3 ] This patch fixes a memory leak, which happens if the connection request is not fulfilled between parsing the DCCP options and handling the SYN (because e.g. the backlog is full), because we forgot to free the list of ack vectors. Reported-by: Jianwen Ji Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0925dd7dcebb20440cdc6b4be1658c75f8da19c1 Author: Jon Maxwell Date: Fri Mar 10 16:40:33 2017 +1100 dccp/tcp: fix routing redirect race [ Upstream commit 45caeaa5ac0b4b11784ac6f932c0ad4c6b67cda0 ] As Eric Dumazet pointed out this also needs to be fixed in IPv6. v2: Contains the IPv6 tcp/Ipv6 dccp patches as well. We have seen a few incidents lately where a dst_enty has been freed with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that dst_entry. If the conditions/timings are right a crash then ensues when the freed dst_entry is referenced later on. A Common crashing back trace is: #8 [] page_fault at ffffffff8163e648 [exception RIP: __tcp_ack_snd_check+74] . . #9 [] tcp_rcv_established at ffffffff81580b64 #10 [] tcp_v4_do_rcv at ffffffff8158b54a #11 [] tcp_v4_rcv at ffffffff8158cd02 #12 [] ip_local_deliver_finish at ffffffff815668f4 #13 [] ip_local_deliver at ffffffff81566bd9 #14 [] ip_rcv_finish at ffffffff8156656d #15 [] ip_rcv at ffffffff81566f06 #16 [] __netif_receive_skb_core at ffffffff8152b3a2 #17 [] __netif_receive_skb at ffffffff8152b608 #18 [] netif_receive_skb at ffffffff8152b690 #19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3] #20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3] #21 [] net_rx_action at ffffffff8152bac2 #22 [] __do_softirq at ffffffff81084b4f #23 [] call_softirq at ffffffff8164845c #24 [] do_softirq at ffffffff81016fc5 #25 [] irq_exit at ffffffff81084ee5 #26 [] do_IRQ at ffffffff81648ff8 Of course it may happen with other NIC drivers as well. It's found the freed dst_entry here: 224 static bool tcp_in_quickack_mode(struct sock *sk)↩ 225 {↩ 226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩ 227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩ 228 ↩ 229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩ 230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩ 231 }↩ But there are other backtraces attributed to the same freed dst_entry in netfilter code as well. All the vmcores showed 2 significant clues: - Remote hosts behind the default gateway had always been redirected to a different gateway. A rtable/dst_entry will be added for that host. Making more dst_entrys with lower reference counts. Making this more probable. - All vmcores showed a postitive LockDroppedIcmps value, e.g: LockDroppedIcmps 267 A closer look at the tcp_v4_err() handler revealed that do_redirect() will run regardless of whether user space has the socket locked. This can result in a race condition where the same dst_entry cached in sk->sk_dst_entry can be decremented twice for the same socket via: do_redirect()->__sk_dst_check()-> dst_release(). Which leads to the dst_entry being prematurely freed with another socket pointing to it via sk->sk_dst_cache and a subsequent crash. To fix this skip do_redirect() if usespace has the socket locked. Instead let the redirect take place later when user space does not have the socket locked. The dccp/IPv6 code is very similar in this respect, so fixing it there too. As Eric Garver pointed out the following commit now invalidates routes. Which can set the dst->obsolete flag so that ipv4_dst_check() returns null and triggers the dst_release(). Fixes: ceb3320610d6 ("ipv4: Kill routes during PMTU/redirect updates.") Cc: Eric Garver Cc: Hannes Sowa Signed-off-by: Jon Maxwell Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit eb27a77d58016382ee145d169092e4f70188b435 Author: Sabrina Dubroca Date: Mon Mar 13 13:28:09 2017 +0100 ipv6: make ECMP route replacement less greedy [ Upstream commit 67e194007be08d071294456274dd53e0a04fdf90 ] Commit 27596472473a ("ipv6: fix ECMP route replacement") introduced a loop that removes all siblings of an ECMP route that is being replaced. However, this loop doesn't stop when it has replaced siblings, and keeps removing other routes with a higher metric. We also end up triggering the WARN_ON after the loop, because after this nsiblings < 0. Instead, stop the loop when we have taken care of all routes with the same metric as the route being replaced. Reproducer: =========== #!/bin/sh ip netns add ns1 ip netns add ns2 ip -net ns1 link set lo up for x in 0 1 2 ; do ip link add veth$x netns ns2 type veth peer name eth$x netns ns1 ip -net ns1 link set eth$x up ip -net ns2 link set veth$x up done ip -net ns1 -6 r a 2000::/64 nexthop via fe80::0 dev eth0 \ nexthop via fe80::1 dev eth1 nexthop via fe80::2 dev eth2 ip -net ns1 -6 r a 2000::/64 via fe80::42 dev eth0 metric 256 ip -net ns1 -6 r a 2000::/64 via fe80::43 dev eth0 metric 2048 echo "before replace, 3 routes" ip -net ns1 -6 r | grep -v '^fe80\|^ff00' echo ip -net ns1 -6 r c 2000::/64 nexthop via fe80::4 dev eth0 \ nexthop via fe80::5 dev eth1 nexthop via fe80::6 dev eth2 echo "after replace, only 2 routes, metric 2048 is gone" ip -net ns1 -6 r | grep -v '^fe80\|^ff00' Fixes: 27596472473a ("ipv6: fix ECMP route replacement") Signed-off-by: Sabrina Dubroca Acked-by: Nicolas Dichtel Reviewed-by: Xin Long Reviewed-by: Michal Kubecek Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit cf08429dd9d5855fdac5c9bc3f15fb16e80342c1 Author: David Ahern Date: Fri Mar 10 09:46:15 2017 -0800 mpls: Send route delete notifications when router module is unloaded [ Upstream commit e37791ec1ad785b59022ae211f63a16189bacebf ] When the mpls_router module is unloaded, mpls routes are deleted but notifications are not sent to userspace leaving userspace caches out of sync. Add the call to mpls_notify_route in mpls_net_exit as routes are freed. Fixes: 0189197f44160 ("mpls: Basic routing support") Signed-off-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f987ee1b77356eacab892e3dd7e4779f745c6359 Author: Etienne Noss Date: Fri Mar 10 16:55:32 2017 +0100 act_connmark: avoid crashing on malformed nlattrs with null parms [ Upstream commit 52491c7607c5527138095edf44c53169dc1ddb82 ] tcf_connmark_init does not check in its configuration if TCA_CONNMARK_PARMS is set, resulting in a null pointer dereference when trying to access it. [501099.043007] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 [501099.043039] IP: [] tcf_connmark_init+0x8b/0x180 [act_connmark] ... [501099.044334] Call Trace: [501099.044345] [] ? tcf_action_init_1+0x198/0x1b0 [501099.044363] [] ? tcf_action_init+0xb0/0x120 [501099.044380] [] ? tcf_exts_validate+0xc4/0x110 [501099.044398] [] ? u32_set_parms+0xa7/0x270 [cls_u32] [501099.044417] [] ? u32_change+0x680/0x87b [cls_u32] [501099.044436] [] ? tc_ctl_tfilter+0x4dd/0x8a0 [501099.044454] [] ? security_capable+0x41/0x60 [501099.044471] [] ? rtnetlink_rcv_msg+0xe1/0x220 [501099.044490] [] ? rtnl_newlink+0x870/0x870 [501099.044507] [] ? netlink_rcv_skb+0xa1/0xc0 [501099.044524] [] ? rtnetlink_rcv+0x24/0x30 [501099.044541] [] ? netlink_unicast+0x184/0x230 [501099.044558] [] ? netlink_sendmsg+0x2f8/0x3b0 [501099.044576] [] ? sock_sendmsg+0x30/0x40 [501099.044592] [] ? SYSC_sendto+0xd3/0x150 [501099.044608] [] ? __do_page_fault+0x2d1/0x510 [501099.044626] [] ? system_call_fast_compare_end+0xc/0x9b Fixes: 22a5dc0e5e3e ("net: sched: Introduce connmark action") Signed-off-by: Étienne Noss Signed-off-by: Victorien Molle Acked-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 32f9d631a3eaab2e81875cf4dae12cec4b7864dc Author: Dmitry V. Levin Date: Tue Mar 7 23:50:50 2017 +0300 uapi: fix linux/packet_diag.h userspace compilation error [ Upstream commit 745cb7f8a5de0805cade3de3991b7a95317c7c73 ] Replace MAX_ADDR_LEN with its numeric value to fix the following linux/packet_diag.h userspace compilation error: /usr/include/linux/packet_diag.h:67:17: error: 'MAX_ADDR_LEN' undeclared here (not in a function) __u8 pdmc_addr[MAX_ADDR_LEN]; This is not the first case in the UAPI where the numeric value of MAX_ADDR_LEN is used instead of symbolic one, uapi/linux/if_link.h already does the same: $ grep MAX_ADDR_LEN include/uapi/linux/if_link.h __u8 mac[32]; /* MAX_ADDR_LEN */ There are no UAPI headers besides these two that use MAX_ADDR_LEN. Signed-off-by: Dmitry V. Levin Acked-by: Pavel Emelyanov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1cad88dd2806805ee0eeb3363093d3a15f1c84c7 Author: Eric Dumazet Date: Fri Mar 3 21:01:03 2017 -0800 net: fix socket refcounting in skb_complete_tx_timestamp() [ Upstream commit 9ac25fc063751379cb77434fef9f3b088cd3e2f7 ] TX skbs do not necessarily hold a reference on skb->sk->sk_refcnt By the time TX completion happens, sk_refcnt might be already 0. sock_hold()/sock_put() would then corrupt critical state, like sk_wmem_alloc and lead to leaks or use after free. Fixes: 62bccb8cdb69 ("net-timestamp: Make the clone operation stand-alone from phy timestamping") Signed-off-by: Eric Dumazet Cc: Alexander Duyck Cc: Johannes Berg Cc: Soheil Hassas Yeganeh Cc: Willem de Bruijn Acked-by: Soheil Hassas Yeganeh Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ddecc365cf9b062b187781b4b26a212c067bea18 Author: Eric Dumazet Date: Fri Mar 3 21:01:02 2017 -0800 net: fix socket refcounting in skb_complete_wifi_ack() [ Upstream commit dd4f10722aeb10f4f582948839f066bebe44e5fb ] TX skbs do not necessarily hold a reference on skb->sk->sk_refcnt By the time TX completion happens, sk_refcnt might be already 0. sock_hold()/sock_put() would then corrupt critical state, like sk_wmem_alloc. Fixes: bf7fa551e0ce ("mac80211: Resolve sk_refcnt/sk_wmem_alloc issue in wifi ack path") Signed-off-by: Eric Dumazet Cc: Alexander Duyck Cc: Johannes Berg Cc: Soheil Hassas Yeganeh Cc: Willem de Bruijn Acked-by: Soheil Hassas Yeganeh Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 0dc0c260ff713013f1d847c48ef8e583798673e2 Author: Eric Dumazet Date: Fri Mar 3 14:08:21 2017 -0800 tcp: fix various issues for sockets morphing to listen state [ Upstream commit 02b2faaf0af1d85585f6d6980e286d53612acfc2 ] Dmitry Vyukov reported a divide by 0 triggered by syzkaller, exploiting tcp_disconnect() path that was never really considered and/or used before syzkaller ;) I was not able to reproduce the bug, but it seems issues here are the three possible actions that assumed they would never trigger on a listener. 1) tcp_write_timer_handler 2) tcp_delack_timer_handler 3) MTU reduction Only IPv6 MTU reduction was properly testing TCP_CLOSE and TCP_LISTEN states from tcp_v6_mtu_reduced() Signed-off-by: Eric Dumazet Reported-by: Dmitry Vyukov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 7f02456af7db872876ebf67f89e6e9ca91f4bee3 Author: Arnaldo Carvalho de Melo Date: Wed Mar 1 16:35:07 2017 -0300 dccp: Unlock sock before calling sk_free() [ Upstream commit d5afb6f9b6bb2c57bd0c05e76e12489dc0d037d9 ] The code where sk_clone() came from created a new socket and locked it, but then, on the error path didn't unlock it. This problem stayed there for a long while, till b0691c8ee7c2 ("net: Unlock sock before calling sk_free()") fixed it, but unfortunately the callers of sk_clone() (now sk_clone_locked()) were not audited and the one in dccp_create_openreq_child() remained. Now in the age of the syskaller fuzzer, this was finally uncovered, as reported by Dmitry: ---- 8< ---- I've got the following report while running syzkaller fuzzer on 86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)") [ BUG: held lock freed! ] 4.10.0+ #234 Not tainted ------------------------- syz-executor6/6898 is freeing memory ffff88006286cac0-ffff88006286d3b7, with a lock still held there! (slock-AF_INET6){+.-...}, at: [] spin_lock include/linux/spinlock.h:299 [inline] (slock-AF_INET6){+.-...}, at: [] sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504 5 locks held by syz-executor6/6898: #0: (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock include/net/sock.h:1460 [inline] #0: (sk_lock-AF_INET6){+.+.+.}, at: [] inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681 #1: (rcu_read_lock){......}, at: [] inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126 #2: (rcu_read_lock){......}, at: [] __skb_unlink include/linux/skbuff.h:1767 [inline] #2: (rcu_read_lock){......}, at: [] __skb_dequeue include/linux/skbuff.h:1783 [inline] #2: (rcu_read_lock){......}, at: [] process_backlog+0x264/0x730 net/core/dev.c:4835 #3: (rcu_read_lock){......}, at: [] ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59 #4: (slock-AF_INET6){+.-...}, at: [] spin_lock include/linux/spinlock.h:299 [inline] #4: (slock-AF_INET6){+.-...}, at: [] sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504 Fix it just like was done by b0691c8ee7c2 ("net: Unlock sock before calling sk_free()"). Reported-by: Dmitry Vyukov Cc: Cong Wang Cc: Eric Dumazet Cc: Gerrit Renker Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/20170301153510.GE15145@kernel.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 9ebdfa6c678ccfada9530a84f959b56b54505848 Author: Alexander Potapenko Date: Wed Mar 1 12:57:20 2017 +0100 net: don't call strlen() on the user buffer in packet_bind_spkt() [ Upstream commit 540e2894f7905538740aaf122bd8e0548e1c34a4 ] KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of uninitialized memory in packet_bind_spkt(): Acked-by: Eric Dumazet ================================================================== BUG: KMSAN: use of unitialized memory CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 0000000000000000 ffff88006b6dfc08 ffffffff82559ae8 ffff88006b6dfb48 ffffffff818a7c91 ffffffff85b9c870 0000000000000092 ffffffff85b9c550 0000000000000000 0000000000000092 00000000ec400911 0000000000000002 Call Trace: [< inline >] __dump_stack lib/dump_stack.c:15 [] dump_stack+0x238/0x290 lib/dump_stack.c:51 [] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003 [] __msan_warning+0x5b/0xb0 mm/kmsan/kmsan_instr.c:424 [< inline >] strlen lib/string.c:484 [] strlcpy+0x9d/0x200 lib/string.c:144 [] packet_bind_spkt+0x144/0x230 net/packet/af_packet.c:3132 [] SYSC_bind+0x40d/0x5f0 net/socket.c:1370 [] SyS_bind+0x82/0xa0 net/socket.c:1356 [] entry_SYSCALL_64_fastpath+0x13/0x8f arch/x86/entry/entry_64.o:? chained origin: 00000000eba00911 [] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67 [< inline >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322 [< inline >] kmsan_save_stack mm/kmsan/kmsan.c:334 [] kmsan_internal_chain_origin+0x118/0x1e0 mm/kmsan/kmsan.c:527 [] __msan_set_alloca_origin4+0xc3/0x130 mm/kmsan/kmsan_instr.c:380 [] SYSC_bind+0x129/0x5f0 net/socket.c:1356 [] SyS_bind+0x82/0xa0 net/socket.c:1356 [] entry_SYSCALL_64_fastpath+0x13/0x8f arch/x86/entry/entry_64.o:? origin description: ----address@SYSC_bind (origin=00000000eb400911) ================================================================== (the line numbers are relative to 4.8-rc6, but the bug persists upstream) , when I run the following program as root: ===================================== #include #include #include #include int main() { struct sockaddr addr; memset(&addr, 0xff, sizeof(addr)); addr.sa_family = AF_PACKET; int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL)); bind(fd, &addr, sizeof(addr)); return 0; } ===================================== This happens because addr.sa_data copied from the userspace is not zero-terminated, and copying it with strlcpy() in packet_bind_spkt() results in calling strlen() on the kernel copy of that non-terminated buffer. Signed-off-by: Alexander Potapenko Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f8a586d62a8a5ae326bb3c2f4955404cbe4c3942 Author: Paul Hüber Date: Sun Feb 26 17:58:19 2017 +0100 l2tp: avoid use-after-free caused by l2tp_ip_backlog_recv [ Upstream commit 51fb60eb162ab84c5edf2ae9c63cf0b878e5547e ] l2tp_ip_backlog_recv may not return -1 if the packet gets dropped. The return value is passed up to ip_local_deliver_finish, which treats negative values as an IP protocol number for resubmission. Signed-off-by: Paul Hüber Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1e4f22fbf574f06957fcbbeaa9295755f51427c7 Author: Julian Anastasov Date: Sun Feb 26 17:14:35 2017 +0200 ipv4: mask tos for input route [ Upstream commit 6e28099d38c0e50d62c1afc054e37e573adf3d21 ] Restore the lost masking of TOS in input route code to allow ip rules to match it properly. Problem [1] noticed by Shmulik Ladkani [1] http://marc.info/?t=137331755300040&r=1&w=2 Fixes: 89aef8921bfb ("ipv4: Delete routing cache.") Signed-off-by: Julian Anastasov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit ab1c72ff6d613bca13f0d65e52199a53afc50069 Author: David Forster Date: Fri Feb 24 14:20:32 2017 +0000 vti6: return GRE_KEY for vti6 [ Upstream commit 7dcdf941cdc96692ab99fd790c8cc68945514851 ] Align vti6 with vti by returning GRE_KEY flag. This enables iproute2 to display tunnel keys on "ip -6 tunnel show" Signed-off-by: David Forster Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 2d28a6be926e7e684793f2895b3c4356db239e8e Author: Matthias Schiffer Date: Thu Feb 23 17:19:41 2017 +0100 vxlan: correctly validate VXLAN ID against VXLAN_N_VID [ Upstream commit 4e37d6911f36545b286d15073f6f2222f840e81c ] The incorrect check caused an off-by-one error: the maximum VID 0xffffff was unusable. Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Signed-off-by: Matthias Schiffer Acked-by: Jiri Benc Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 3f51d19d98be281113d2eb801c9021621bb75304 Author: Theodore Ts'o Date: Tue Feb 14 11:31:15 2017 -0500 ext4: don't BUG when truncating encrypted inodes on the orphan list [ Upstream commit 0d06863f903ac5f4f6efb0273079d27de3e53a28 ] Fix a BUG when the kernel tries to mount a file system constructed as follows: echo foo > foo.txt mke2fs -Fq -t ext4 -O encrypt foo.img 100 debugfs -w foo.img << EOF write foo.txt a set_inode_field a i_flags 0x80800 set_super_value s_last_orphan 12 quit EOF root@kvm-xfstests:~# mount -o loop foo.img /mnt [ 160.238770] ------------[ cut here ]------------ [ 160.240106] kernel BUG at /usr/projects/linux/ext4/fs/ext4/inode.c:3874! [ 160.240106] invalid opcode: 0000 [#1] SMP [ 160.240106] Modules linked in: [ 160.240106] CPU: 0 PID: 2547 Comm: mount Tainted: G W 4.10.0-rc3-00034-gcdd33b941b67 #227 [ 160.240106] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014 [ 160.240106] task: f4518000 task.stack: f47b6000 [ 160.240106] EIP: ext4_block_zero_page_range+0x1a7/0x2b4 [ 160.240106] EFLAGS: 00010246 CPU: 0 [ 160.240106] EAX: 00000001 EBX: f7be4b50 ECX: f47b7dc0 EDX: 00000007 [ 160.240106] ESI: f43b05a8 EDI: f43babec EBP: f47b7dd0 ESP: f47b7dac [ 160.240106] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 160.240106] CR0: 80050033 CR2: bfd85b08 CR3: 34a00680 CR4: 000006f0 [ 160.240106] Call Trace: [ 160.240106] ext4_truncate+0x1e9/0x3e5 [ 160.240106] ext4_fill_super+0x286f/0x2b1e [ 160.240106] ? set_blocksize+0x2e/0x7e [ 160.240106] mount_bdev+0x114/0x15f [ 160.240106] ext4_mount+0x15/0x17 [ 160.240106] ? ext4_calculate_overhead+0x39d/0x39d [ 160.240106] mount_fs+0x58/0x115 [ 160.240106] vfs_kern_mount+0x4b/0xae [ 160.240106] do_mount+0x671/0x8c3 [ 160.240106] ? _copy_from_user+0x70/0x83 [ 160.240106] ? strndup_user+0x31/0x46 [ 160.240106] SyS_mount+0x57/0x7b [ 160.240106] do_int80_syscall_32+0x4f/0x61 [ 160.240106] entry_INT80_32+0x2f/0x2f [ 160.240106] EIP: 0xb76b919e [ 160.240106] EFLAGS: 00000246 CPU: 0 [ 160.240106] EAX: ffffffda EBX: 08053838 ECX: 08052188 EDX: 080537e8 [ 160.240106] ESI: c0ed0000 EDI: 00000000 EBP: 080537e8 ESP: bfa13660 [ 160.240106] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [ 160.240106] Code: 59 8b 00 a8 01 0f 84 09 01 00 00 8b 07 66 25 00 f0 66 3d 00 80 75 61 89 f8 e8 3e e2 ff ff 84 c0 74 56 83 bf 48 02 00 00 00 75 02 <0f> 0b 81 7d e8 00 10 00 00 74 02 0f 0b 8b 43 04 8b 53 08 31 c9 [ 160.240106] EIP: ext4_block_zero_page_range+0x1a7/0x2b4 SS:ESP: 0068:f47b7dac [ 160.317241] ---[ end trace d6a773a375c810a5 ]--- The problem is that when the kernel tries to truncate an inode in ext4_truncate(), it tries to clear any on-disk data beyond i_size. Without the encryption key, it can't do that, and so it triggers a BUG. E2fsck does *not* provide this service, and in practice most file systems have their orphan list processed by e2fsck, so to avoid crashing, this patch skips this step if we don't have access to the encryption key (which is the case when processing the orphan list; in all other cases, we will have the encryption key, or the kernel wouldn't have allowed the file to be opened). An open question is whether the fact that e2fsck isn't clearing the bytes beyond i_size causing problems --- and if we've lived with it not doing it for so long, can we drop this from the kernel replay of the orphan list in all cases (not just when we don't have the key for encrypted inodes). Addresses-Google-Bug: #35209576 Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit 5363fd5fe17ac1665ff09d0ab616af1e43287b7a Author: Mikulas Patocka Date: Wed Feb 15 11:26:10 2017 -0500 dm: flush queued bios when process blocks to avoid deadlock [ Upstream commit d67a5f4b5947aba4bfe9a80a2b86079c215ca755 ] Commit df2cb6daa4 ("block: Avoid deadlocks with bio allocation by stacking drivers") created a workqueue for every bio set and code in bio_alloc_bioset() that tries to resolve some low-memory deadlocks by redirecting bios queued on current->bio_list to the workqueue if the system is low on memory. However other deadlocks (see below **) may happen, without any low memory condition, because generic_make_request is queuing bios to current->bio_list (rather than submitting them). ** the related dm-snapshot deadlock is detailed here: https://www.redhat.com/archives/dm-devel/2016-July/msg00065.html Fix this deadlock by redirecting any bios on current->bio_list to the bio_set's rescue workqueue on every schedule() call. Consequently, when the process blocks on a mutex, the bios queued on current->bio_list are dispatched to independent workqueus and they can complete without waiting for the mutex to be available. The structure blk_plug contains an entry cb_list and this list can contain arbitrary callback functions that are called when the process blocks. To implement this fix DM (ab)uses the onstack plug's cb_list interface to get its flush_current_bio_list() called at schedule() time. This fixes the snapshot deadlock - if the map method blocks, flush_current_bio_list() will be called and it redirects bios waiting on current->bio_list to appropriate workqueues. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1267650 Depends-on: df2cb6daa4 ("block: Avoid deadlocks with bio allocation by stacking drivers") Signed-off-by: Mikulas Patocka Signed-off-by: Mike Snitzer Signed-off-by: Sasha Levin commit 99472ef5839d8325058185ad56eb6d21302cee38 Author: Luis de Bethencourt Date: Mon Nov 30 14:32:17 2015 +0000 mvsas: fix misleading indentation [ Upstream commit 7789cd39274c51bf475411fe22a8ee7255082809 ] Fix a smatch warning: drivers/scsi/mvsas/mv_sas.c:740 mvs_task_prep() warn: curly braces intended? The code is correct, the indention is misleading. When the device is not ready we want to return SAS_PHY_DOWN. But current indentation makes it look like we only do so in the else branch of if (mvi_dev). Signed-off-by: Luis de Bethencourt Reviewed-by: Johannes Thumshirn Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 325b7e57a494ca56c0516ca3739ce8feebdc147a Author: Ralf Baechle Date: Tue Sep 20 14:33:01 2016 +0200 MIPS: DEC: Avoid la pseudo-instruction in delay slots [ Upstream commit 3021773c7c3e75e20b693931a19362681e744ea9 ] When expanding the la or dla pseudo-instruction in a delay slot the GNU assembler will complain should the pseudo-instruction expand to multiple actual instructions, since only the first of them will be in the delay slot leading to the pseudo-instruction being only partially executed if the branch is taken. Use of PTR_LA in the dec int-handler.S leads to such warnings: arch/mips/dec/int-handler.S: Assembler messages: arch/mips/dec/int-handler.S:149: Warning: macro instruction expanded into multiple instructions in a branch delay slot arch/mips/dec/int-handler.S:198: Warning: macro instruction expanded into multiple instructions in a branch delay slot Avoid this by open coding the PTR_LA macros. Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit 3930194e22c64d4f5dc4891aecd2956d6908856b Author: Arnd Bergmann Date: Mon Jan 16 14:20:54 2017 +0100 cpmac: remove hopeless #warning [ Upstream commit d43e6fb4ac4abfe4ef7c102833ed02330ad701e0 ] The #warning was present 10 years ago when the driver first got merged. As the platform is rather obsolete by now, it seems very unlikely that the warning will cause anyone to fix the code properly. kernelci.org reports the warning for every build in the meantime, so I think it's better to just turn it into a code comment to reduce noise. Signed-off-by: Arnd Bergmann Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 9ef3aa4c76aaf8e6f0a2eaf42125b9e0e048caf1 Author: Arnd Bergmann Date: Tue Jan 17 16:18:43 2017 +0100 MIPS: ralink: Remove unused rt*_wdt_reset functions [ Upstream commit 886f9c69fc68f56ddea34d3de51ac1fc2ac8dfbc ] All pointers to these functions were removed, so now they produce warnings: arch/mips/ralink/rt305x.c:92:13: error: 'rt305x_wdt_reset' defined but not used [-Werror=unused-function] This removes the functions. If we need them again, the patch can be reverted later. Fixes: f576fb6a0700 ("MIPS: ralink: cleanup the soc specific pinmux data") Signed-off-by: Arnd Bergmann Cc: John Crispin Cc: Colin Ian King Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/15044/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit 99f864ed7610a19823d59e507b10afa357027f37 Author: John Crispin Date: Tue Dec 20 19:12:46 2016 +0100 MIPS: ralink: Cosmetic change to prom_init(). [ Upstream commit 9c48568b3692f1a56cbf1935e4eea835e6b185b1 ] Over the years the code has been changed various times leading to argc/argv being defined in a different function to where we actually use the variables. Clean this up by moving them to prom_init_cmdline(). Signed-off-by: John Crispin Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/14902/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit cbc32b44e8c50e777b7067f3e5a9df2c7564cde3 Author: Arnd Bergmann Date: Fri Feb 3 10:49:17 2017 +0100 mtd: pmcmsp: use kstrndup instead of kmalloc+strncpy [ Upstream commit 906b268477bc03daaa04f739844c120fe4dbc991 ] kernelci.org reports a warning for this driver, as it copies a local variable into a 'const char *' string: drivers/mtd/maps/pmcmsp-flash.c:149:30: warning: passing argument 1 of 'strncpy' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] Using kstrndup() simplifies the code and avoids the warning. Signed-off-by: Arnd Bergmann Acked-by: Marek Vasut Signed-off-by: Brian Norris Signed-off-by: Sasha Levin commit 0b9604a2c14691346459a27a2089e8226c735fea Author: Arnd Bergmann Date: Tue Jan 17 16:18:46 2017 +0100 MIPS: ip22: Fix ip28 build for modern gcc [ Upstream commit 23ca9b522383d3b9b7991d8586db30118992af4a ] kernelci reports a failure of the ip28_defconfig build after upgrading its gcc version: arch/mips/sgi-ip22/Platform:29: *** gcc doesn't support needed option -mr10k-cache-barrier=store. Stop. The problem apparently is that the -mr10k-cache-barrier=store option is now rejected for CPUs other than r10k. Explicitly including the CPU in the check fixes this and is safe because both options were introduced in gcc-4.4. Signed-off-by: Arnd Bergmann Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/15049/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit c5e13155b83cce56be3b939287c567058460218e Author: Arnd Bergmann Date: Fri Feb 3 17:43:50 2017 +0100 MIPS: ip27: Disable qlge driver in defconfig [ Upstream commit b617649468390713db1515ea79fc772d2eb897a8 ] One of the last remaining failures in kernelci.org is for a gcc bug: drivers/net/ethernet/qlogic/qlge/qlge_main.c:4819:1: error: insn does not satisfy its constraints: drivers/net/ethernet/qlogic/qlge/qlge_main.c:4819:1: internal compiler error: in extract_constrain_insn, at recog.c:2190 This is apparently broken in gcc-6 but fixed in gcc-7, and I cannot reproduce the problem here. However, it is clear that ip27_defconfig does not actually need this driver as the platform has only PCI-X but not PCIe, and the qlge adapter in turn is PCIe-only. The driver was originally enabled in 2010 along with lots of other drivers. Fixes: 59d302b342e5 ("MIPS: IP27: Make defconfig useful again.") Signed-off-by: Arnd Bergmann Cc: Ralf Baechle Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/15197/ Signed-off-by: James Hogan Signed-off-by: Sasha Levin commit ce842a00e5c4d1c4ac3e6eea1217c48f0cfb09b3 Author: Arnd Bergmann Date: Fri Feb 3 23:33:23 2017 +0100 crypto: improve gcc optimization flags for serpent and wp512 [ Upstream commit 7d6e9105026788c497f0ab32fa16c82f4ab5ff61 ] An ancient gcc bug (first reported in 2003) has apparently resurfaced on MIPS, where kernelci.org reports an overly large stack frame in the whirlpool hash algorithm: crypto/wp512.c:987:1: warning: the frame size of 1112 bytes is larger than 1024 bytes [-Wframe-larger-than=] With some testing in different configurations, I'm seeing large variations in stack frames size up to 1500 bytes for what should have around 300 bytes at most. I also checked the reference implementation, which is essentially the same code but also comes with some test and benchmarking infrastructure. It seems that recent compiler versions on at least arm, arm64 and powerpc have a partial fix for this problem, but enabling "-fsched-pressure", but even with that fix they suffer from the issue to a certain degree. Some testing on arm64 shows that the time needed to hash a given amount of data is roughly proportional to the stack frame size here, which makes sense given that the wp512 implementation is doing lots of loads for table lookups, and the problem with the overly large stack is a result of doing a lot more loads and stores for spilled registers (as seen from inspecting the object code). Disabling -fschedule-insns consistently fixes the problem for wp512, in my collection of cross-compilers, the results are consistently better or identical when comparing the stack sizes in this function, though some architectures (notable x86) have schedule-insns disabled by default. The four columns are: default: -O2 press: -O2 -fsched-pressure nopress: -O2 -fschedule-insns -fno-sched-pressure nosched: -O2 -no-schedule-insns (disables sched-pressure) default press nopress nosched alpha-linux-gcc-4.9.3 1136 848 1136 176 am33_2.0-linux-gcc-4.9.3 2100 2076 2100 2104 arm-linux-gnueabi-gcc-4.9.3 848 848 1048 352 cris-linux-gcc-4.9.3 272 272 272 272 frv-linux-gcc-4.9.3 1128 1000 1128 280 hppa64-linux-gcc-4.9.3 1128 336 1128 184 hppa-linux-gcc-4.9.3 644 308 644 276 i386-linux-gcc-4.9.3 352 352 352 352 m32r-linux-gcc-4.9.3 720 656 720 268 microblaze-linux-gcc-4.9.3 1108 604 1108 256 mips64-linux-gcc-4.9.3 1328 592 1328 208 mips-linux-gcc-4.9.3 1096 624 1096 240 powerpc64-linux-gcc-4.9.3 1088 432 1088 160 powerpc-linux-gcc-4.9.3 1080 584 1080 224 s390-linux-gcc-4.9.3 456 456 624 360 sh3-linux-gcc-4.9.3 292 292 292 292 sparc64-linux-gcc-4.9.3 992 240 992 208 sparc-linux-gcc-4.9.3 680 592 680 312 x86_64-linux-gcc-4.9.3 224 240 272 224 xtensa-linux-gcc-4.9.3 1152 704 1152 304 aarch64-linux-gcc-7.0.0 224 224 1104 208 arm-linux-gnueabi-gcc-7.0.1 824 824 1048 352 mips-linux-gcc-7.0.0 1120 648 1120 272 x86_64-linux-gcc-7.0.1 240 240 304 240 arm-linux-gnueabi-gcc-4.4.7 840 392 arm-linux-gnueabi-gcc-4.5.4 784 728 784 320 arm-linux-gnueabi-gcc-4.6.4 736 728 736 304 arm-linux-gnueabi-gcc-4.7.4 944 784 944 352 arm-linux-gnueabi-gcc-4.8.5 464 464 760 352 arm-linux-gnueabi-gcc-4.9.3 848 848 1048 352 arm-linux-gnueabi-gcc-5.3.1 824 824 1064 336 arm-linux-gnueabi-gcc-6.1.1 808 808 1056 344 arm-linux-gnueabi-gcc-7.0.1 824 824 1048 352 Trying the same test for serpent-generic, the picture is a bit different, and while -fno-schedule-insns is generally better here than the default, -fsched-pressure wins overall, so I picked that instead. default press nopress nosched alpha-linux-gcc-4.9.3 1392 864 1392 960 am33_2.0-linux-gcc-4.9.3 536 524 536 528 arm-linux-gnueabi-gcc-4.9.3 552 552 776 536 cris-linux-gcc-4.9.3 528 528 528 528 frv-linux-gcc-4.9.3 536 400 536 504 hppa64-linux-gcc-4.9.3 524 208 524 480 hppa-linux-gcc-4.9.3 768 472 768 508 i386-linux-gcc-4.9.3 564 564 564 564 m32r-linux-gcc-4.9.3 712 576 712 532 microblaze-linux-gcc-4.9.3 724 392 724 512 mips64-linux-gcc-4.9.3 720 384 720 496 mips-linux-gcc-4.9.3 728 384 728 496 powerpc64-linux-gcc-4.9.3 704 304 704 480 powerpc-linux-gcc-4.9.3 704 296 704 480 s390-linux-gcc-4.9.3 560 560 592 536 sh3-linux-gcc-4.9.3 540 540 540 540 sparc64-linux-gcc-4.9.3 544 352 544 496 sparc-linux-gcc-4.9.3 544 344 544 496 x86_64-linux-gcc-4.9.3 528 536 576 528 xtensa-linux-gcc-4.9.3 752 544 752 544 aarch64-linux-gcc-7.0.0 432 432 656 480 arm-linux-gnueabi-gcc-7.0.1 616 616 808 536 mips-linux-gcc-7.0.0 720 464 720 488 x86_64-linux-gcc-7.0.1 536 528 600 536 arm-linux-gnueabi-gcc-4.4.7 592 440 arm-linux-gnueabi-gcc-4.5.4 776 448 776 544 arm-linux-gnueabi-gcc-4.6.4 776 448 776 544 arm-linux-gnueabi-gcc-4.7.4 768 448 768 544 arm-linux-gnueabi-gcc-4.8.5 488 488 776 544 arm-linux-gnueabi-gcc-4.9.3 552 552 776 536 arm-linux-gnueabi-gcc-5.3.1 552 552 776 536 arm-linux-gnueabi-gcc-6.1.1 560 560 776 536 arm-linux-gnueabi-gcc-7.0.1 616 616 808 536 I did not do any runtime tests with serpent, so it is possible that stack frame size does not directly correlate with runtime performance here and it actually makes things worse, but it's more likely to help here, and the reduced stack frame size is probably enough reason to apply the patch, especially given that the crypto code is often used in deep call chains. Link: https://kernelci.org/build/id/58797d7559b5149efdf6c3a9/logs/ Link: http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11488 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149 Cc: Ralf Baechle Signed-off-by: Arnd Bergmann Signed-off-by: Herbert Xu Signed-off-by: Sasha Levin commit 7c2b8b923b73a9977d044f031cbcf6257e226107 Author: Arnd Bergmann Date: Mon Jan 16 12:06:09 2017 +0100 libceph: use BUG() instead of BUG_ON(1) [ Upstream commit d24cdcd3e40a6825135498e11c20c7976b9bf545 ] I ran into this compile warning, which is the result of BUG_ON(1) not always leading to the compiler treating the code path as unreachable: include/linux/ceph/osdmap.h: In function 'ceph_can_shift_osds': include/linux/ceph/osdmap.h:62:1: error: control reaches end of non-void function [-Werror=return-type] Using BUG() here avoids the warning. Signed-off-by: Arnd Bergmann Signed-off-by: Ilya Dryomov Signed-off-by: Sasha Levin commit 2c40db2389d1a839c2fd308d1295360e48822859 Author: Heiko Carstens Date: Sun Feb 5 23:03:18 2017 +0100 s390: use correct input data address for setup_randomness [ Upstream commit 4920e3cf77347d7d7373552d4839e8d832321313 ] The current implementation of setup_randomness uses the stack address and therefore the pointer to the SYSIB 3.2.2 block as input data address. Furthermore the length of the input data is the number of virtual-machine description blocks which is typically one. This means that typically a single zero byte is fed to add_device_randomness. Fix both of these and use the address of the first virtual machine description block as input data address and also use the correct length. Fixes: bcfcbb6bae64 ("s390: add system information as device randomness") Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin commit 5a5c7a67ca04f669586444685d3b4e43b9ad865d Author: Heiko Carstens Date: Sat Feb 4 11:40:36 2017 +0100 s390: make setup_randomness work [ Upstream commit da8fd820f389a0e29080b14c61bf5cf1d8ef5ca1 ] Commit bcfcbb6bae64 ("s390: add system information as device randomness") intended to add some virtual machine specific information to the randomness pool. Unfortunately it uses the page allocator before it is ready to use. In result the page allocator always returns NULL and the setup_randomness function never adds anything to the randomness pool. To fix this use memblock_alloc and memblock_free instead. Fixes: bcfcbb6bae64 ("s390: add system information as device randomness") Signed-off-by: Heiko Carstens Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin commit a12924ee8b30ffdd6de324d425ac80fff328c4da Author: Chao Peng Date: Tue Feb 21 03:50:01 2017 -0500 KVM: VMX: use correct vmcs_read/write for guest segment selector/base [ Upstream commit 96794e4ed4d758272c486e1529e431efb7045265 ] Guest segment selector is 16 bit field and guest segment base is natural width field. Fix two incorrect invocations accordingly. Without this patch, build fails when aggressive inlining is used with ICC. Cc: stable@vger.kernel.org Signed-off-by: Chao Peng Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 209fd3f3ef14e0b0a1d07d6ff27d75c49e656e84 Author: Alexander Popov Date: Tue Feb 28 19:54:40 2017 +0300 tty: n_hdlc: get rid of racy n_hdlc.tbuf [ Upstream commit 82f2341c94d270421f383641b7cd670e474db56b ] Currently N_HDLC line discipline uses a self-made singly linked list for data buffers and has n_hdlc.tbuf pointer for buffer retransmitting after an error. The commit be10eb7589337e5defbe214dae038a53dd21add8 ("tty: n_hdlc add buffer flushing") introduced racy access to n_hdlc.tbuf. After tx error concurrent flush_tx_queue() and n_hdlc_send_frames() can put one data buffer to tx_free_buf_list twice. That causes double free in n_hdlc_release(). Let's use standard kernel linked list and get rid of n_hdlc.tbuf: in case of tx error put current data buffer after the head of tx_buf_list. Signed-off-by: Alexander Popov Cc: stable Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 33455ca987a06dd2f1def30cf34607cd165966bd Author: Jiri Slaby Date: Thu Nov 26 19:28:26 2015 +0100 TTY: n_hdlc, fix lockdep false positive [ Upstream commit e9b736d88af1a143530565929390cadf036dc799 ] The class of 4 n_hdls buf locks is the same because a single function n_hdlc_buf_list_init is used to init all the locks. But since flush_tx_queue takes n_hdlc->tx_buf_list.spinlock and then calls n_hdlc_buf_put which takes n_hdlc->tx_free_buf_list.spinlock, lockdep emits a warning: ============================================= [ INFO: possible recursive locking detected ] 4.3.0-25.g91e30a7-default #1 Not tainted --------------------------------------------- a.out/1248 is trying to acquire lock: (&(&list->spinlock)->rlock){......}, at: [] n_hdlc_buf_put+0x20/0x60 [n_hdlc] but task is already holding lock: (&(&list->spinlock)->rlock){......}, at: [] n_hdlc_tty_ioctl+0x127/0x1d0 [n_hdlc] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&list->spinlock)->rlock); lock(&(&list->spinlock)->rlock); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by a.out/1248: #0: (&tty->ldisc_sem){++++++}, at: [] tty_ldisc_ref_wait+0x20/0x50 #1: (&(&list->spinlock)->rlock){......}, at: [] n_hdlc_tty_ioctl+0x127/0x1d0 [n_hdlc] ... Call Trace: ... [] _raw_spin_lock_irqsave+0x50/0x70 [] n_hdlc_buf_put+0x20/0x60 [n_hdlc] [] n_hdlc_tty_ioctl+0x144/0x1d0 [n_hdlc] [] tty_ioctl+0x3f1/0xe40 ... Fix it by initializing the spin_locks separately. This removes also reduntand memset of a freshly kzallocated space. Signed-off-by: Jiri Slaby Reported-by: Dmitry Vyukov Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 14d82e73c8f9288866619df7aa32745c35bfa7c5 Author: James Smart Date: Sun Feb 12 13:52:25 2017 -0800 scsi: lpfc: Correct WQ creation for pagesize [ Upstream commit 8ea73db486cda442f0671f4bc9c03a76be398a28 ] Correct WQ creation for pagesize The driver was calculating the adapter command pagesize indicator from the system pagesize. However, the buffers the driver allocates are only one size (SLI4_PAGE_SIZE), so no calculation was necessary. Signed-off-by: Dick Kennedy Signed-off-by: James Smart Reviewed-by: Hannes Reinecke Reviewed-by: Johannes Thumshirn Reviewed-by: Christoph Hellwig Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit aab728f85cfcce7db6d6b9d1578d8a39e33d9a86 Author: Ralf Baechle Date: Thu Dec 15 12:27:21 2016 +0100 MIPS: IP22: Reformat inline assembler code to modern standards. [ Upstream commit f9f1c8db1c37253805eaa32265e1e1af3ae7d0a4 ] Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit 1814c0e512891a6a742384faa17e6362db5c1de0 Author: Christoph Hellwig Date: Mon Feb 20 07:21:33 2017 +0100 nfsd: special case truncates some more [ Upstream commit 41f53350a0f36a7b8e31bec0d0ca907e028ab4cd ] Both the NFS protocols and the Linux VFS use a setattr operation with a bitmap of attributs to set to set various file attributes including the file size and the uid/gid. The Linux syscalls never mixes size updates with unrelated updates like the uid/gid, and some file systems like XFS and GFS2 rely on the fact that truncates might not update random other attributes, and many other file systems handle the case but do not update the different attributes in the same transaction. NFSD on the other hand passes the attributes it gets on the wire more or less directly through to the VFS, leading to updates the file systems don't expect. XFS at least has an assert on the allowed attributes, which caught an unusual NFS client setting the size and group at the same time. To handle this issue properly this switches nfsd to call vfs_truncate for size changes, and then handle all other attributes through notify_change. As a side effect this also means less boilerplace code around the size change as we can now reuse the VFS code. Signed-off-by: Christoph Hellwig Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin commit 722690069fc46fb733b35dd80cebae359c79ff09 Author: Christoph Hellwig Date: Mon Feb 20 17:04:42 2017 -0500 nfsd: minor nfsd_setattr cleanup [ Upstream commit 758e99fefe1d9230111296956335cd35995c0eaf ] Simplify exit paths, size_change use. Signed-off-by: Christoph Hellwig Cc: stable@kernel.org Signed-off-by: J. Bruce Fields Signed-off-by: Sasha Levin commit b1427077e8b8e188bece01f03908a736429e1b0f Author: Peter Rosin Date: Wed Feb 1 21:40:56 2017 +0100 iio: pressure: mpl3115: do not rely on structure field ordering [ Upstream commit 9cf6cdba586ced75c69b8314b88b2d2f5ce9b3ed ] Fixes a regression triggered by a change in the layout of struct iio_chan_spec, but the real bug is in the driver which assumed a specific structure layout in the first place. Hint: the two bits were not OR:ed together as implied by the indentation prior to this patch, there was a comma between them, which accidentally moved the ..._SCALE bit to the next structure field. That field was .info_mask_shared_by_type before the _available attributes was added by commit 51239600074b ("iio:core: add a callback to allow drivers to provide _available attributes") and .info_mask_separate_available afterwards, and the regression happened. info_mask_shared_by_type is actually a better choice than the originally intended info_mask_separate for the ..._SCALE bit since a constant is returned from mpl3115_read_raw for the scale. Using info_mask_shared_by_type also preserves the behavior from before the regression and is therefore less likely to cause other interesting side effects. The above mentioned regression causes an unintended sysfs attibute to show up that is not backed by code, in turn causing the following NULL pointer defererence to happen on access. Segmentation fault Unable to handle kernel NULL pointer dereference at virtual address 00000000 pgd = ecc3c000 [00000000] *pgd=87f91831 Internal error: Oops: 80000007 [#1] SMP ARM Modules linked in: CPU: 1 PID: 1051 Comm: cat Not tainted 4.10.0-rc5-00009-gffd8858-dirty #3 Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) task: ed54ec00 task.stack: ee2bc000 PC is at 0x0 LR is at iio_read_channel_info_avail+0x40/0x280 pc : [<00000000>] lr : [] psr: a0070013 sp : ee2bdda8 ip : 00000000 fp : ee2bddf4 r10: c0a53c74 r9 : ed79f000 r8 : ee8d1018 r7 : 00001000 r6 : 00000fff r5 : ee8b9a00 r4 : ed79f000 r3 : ee2bddc4 r2 : ee2bddbc r1 : c0a86dcc r0 : ee8d1000 Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none Control: 10c5387d Table: 3cc3c04a DAC: 00000051 Process cat (pid: 1051, stack limit = 0xee2bc210) Stack: (0xee2bdda8 to 0xee2be000) dda0: ee2bddc0 00000002 c016d720 c016d394 ed54ec00 00000000 ddc0: 60070013 ed413780 00000001 edffd480 ee8b9a00 00000fff 00001000 ee8d1018 dde0: ed79f000 c0a53c74 ee2bde0c ee2bddf8 c0513c58 c06fbbe8 edffd480 edffd540 de00: ee2bde3c ee2bde10 c0293474 c0513c40 c02933e4 ee2bde60 00000001 ed413780 de20: 00000001 ed413780 00000000 edffd480 ee2bde4c ee2bde40 c0291d00 c02933f0 de40: ee2bde9c ee2bde50 c024679c c0291ce0 edffd4b0 b6e37000 00020000 ee2bdf78 de60: 00000000 00000000 ed54ec00 ed013200 00000817 c0a111fc edffd540 ed413780 de80: b6e37000 00020000 00020000 ee2bdf78 ee2bded4 ee2bdea0 c0292890 c0246604 dea0: c0117940 c016ba50 00000025 c0a111fc b6e37000 ed413780 ee2bdf78 00020000 dec0: ee2bc000 b6e37000 ee2bdf44 ee2bded8 c021d158 c0292770 c0117764 b6e36004 dee0: c0f0d7c4 ee2bdfb0 b6f89228 00021008 ee2bdfac ee2bdf00 c0101374 c0117770 df00: 00000000 00000000 ee2bc000 00000000 ee2bdf34 ee2bdf20 c016ba04 c0171080 df20: 00000000 00020000 ed413780 b6e37000 00000000 ee2bdf78 ee2bdf74 ee2bdf48 df40: c021e7a0 c021d130 c023e300 c023e280 ee2bdf74 00000000 00000000 ed413780 df60: ed413780 00020000 ee2bdfa4 ee2bdf78 c021e870 c021e71c 00000000 00000000 df80: 00020000 00020000 b6e37000 00000003 c0108084 00000000 00000000 ee2bdfa8 dfa0: c0107ee0 c021e838 00020000 00020000 00000003 b6e37000 00020000 0001a2b4 dfc0: 00020000 00020000 b6e37000 00000003 7fffe000 00000000 00000000 00020000 dfe0: 00000000 be98eb4c 0000c740 b6f1985c 60070010 00000003 00000000 00000000 Backtrace: [] (iio_read_channel_info_avail) from [] (dev_attr_show+0x24/0x50) r10:c0a53c74 r9:ed79f000 r8:ee8d1018 r7:00001000 r6:00000fff r5:ee8b9a00 r4:edffd480 [] (dev_attr_show) from [] (sysfs_kf_seq_show+0x90/0x110) r5:edffd540 r4:edffd480 [] (sysfs_kf_seq_show) from [] (kernfs_seq_show+0x2c/0x30) r10:edffd480 r9:00000000 r8:ed413780 r7:00000001 r6:ed413780 r5:00000001 r4:ee2bde60 r3:c02933e4 [] (kernfs_seq_show) from [] (seq_read+0x1a4/0x4e0) [] (seq_read) from [] (kernfs_fop_read+0x12c/0x1cc) r10:ee2bdf78 r9:00020000 r8:00020000 r7:b6e37000 r6:ed413780 r5:edffd540 r4:c0a111fc [] (kernfs_fop_read) from [] (__vfs_read+0x34/0x118) r10:b6e37000 r9:ee2bc000 r8:00020000 r7:ee2bdf78 r6:ed413780 r5:b6e37000 r4:c0a111fc [] (__vfs_read) from [] (vfs_read+0x90/0x11c) r8:ee2bdf78 r7:00000000 r6:b6e37000 r5:ed413780 r4:00020000 [] (vfs_read) from [] (SyS_read+0x44/0x90) r8:00020000 r7:ed413780 r6:ed413780 r5:00000000 r4:00000000 [] (SyS_read) from [] (ret_fast_syscall+0x0/0x1c) r10:00000000 r8:c0108084 r7:00000003 r6:b6e37000 r5:00020000 r4:00020000 Code: bad PC value ---[ end trace 9c4938ccd0389004 ]--- Fixes: cc26ad455f57 ("iio: Add Freescale MPL3115A2 pressure / temperature sensor driver") Fixes: 51239600074b ("iio:core: add a callback to allow drivers to provide _available attributes") Reported-by: Ken Lin Tested-by: Ken Lin Signed-off-by: Peter Rosin Cc: Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin commit 776863245d20de1ea9fd3dd484e95a50c29070a8 Author: Peter Rosin Date: Wed Feb 1 21:40:57 2017 +0100 iio: pressure: mpl115: do not rely on structure field ordering [ Upstream commit 6a6e1d56a0769795a36c0461c64bf5e5b9bbb4c0 ] Fixes a regression triggered by a change in the layout of struct iio_chan_spec, but the real bug is in the driver which assumed a specific structure layout in the first place. Hint: the three bits were not OR:ed together as implied by the indentation prior to this patch, there was a comma between the first two, which accidentally moved the ..._SCALE and ..._OFFSET bits to the next structure field. That field was .info_mask_shared_by_type before the _available attributes was added by commit 51239600074b ("iio:core: add a callback to allow drivers to provide _available attributes") and .info_mask_separate_available afterwards, and the regression happened. info_mask_shared_by_type is actually a better choice than the originally intended info_mask_separate for the ..._SCALE and ..._OFFSET bits since a constant is returned from mpl115_read_raw for the scale/offset. Using info_mask_shared_by_type also preserves the behavior from before the regression and is therefore less likely to cause other interesting side effects. The above mentioned regression causes unintended sysfs attibutes to show up that are not backed by code, in turn causing a NULL pointer defererence to happen on access. Fixes: 3017d90e8931 ("iio: Add Freescale MPL115A2 pressure / temperature sensor driver") Fixes: 51239600074b ("iio:core: add a callback to allow drivers to provide _available attributes") Signed-off-by: Peter Rosin Cc: Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin commit 0423dcf950cabb9613fcfc32b5966d71abe3686e Author: Theodore Ts'o Date: Sat Feb 4 23:38:06 2017 -0500 ext4: preserve the needs_recovery flag when the journal is aborted [ Upstream commit 97abd7d4b5d9c48ec15c425485f054e1c15e591b ] If the journal is aborted, the needs_recovery feature flag should not be removed. Otherwise, it's the journal might not get replayed and this could lead to more data getting lost. Signed-off-by: Theodore Ts'o Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 04a8178ad1eb8a96b4af1c93eac3a72304328996 Author: Hannes Reinecke Date: Tue Apr 26 08:06:58 2016 +0200 sd: get disk reference in sd_check_events() [ Upstream commit eb72d0bb84eee5d0dc3044fd17b75e7101dabb57 ] sd_check_events() is called asynchronously, and might race with device removal. So always take a disk reference when processing the event to avoid the device being removed while the event is processed. Signed-off-by: Hannes Reinecke Reviewed-by: Ewan D. Milne Reviewed-by: Bart Van Assche Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 6c2310c96a6f401838d41f08c3df756954f6a6d5 Author: Guennadi Liakhovetski Date: Mon Dec 12 09:16:51 2016 -0200 [media] uvcvideo: Fix a wrong macro [ Upstream commit 17c341ec0115837a610b2da15e32546e26068234 ] Don't mix up UVC_BUF_STATE_* and VB2_BUF_STATE_* codes. Fixes: 6998b6fb4b1c ("[media] uvcvideo: Use videobuf2-vmalloc") Cc: stable@vger.kernel.org Signed-off-by: Guennadi Liakhovetski Signed-off-by: Laurent Pinchart Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 7a69ea1de73b9be5d5b18d81980e5f9972a39748 Author: Maxime Jayat Date: Tue Feb 21 18:35:51 2017 +0100 net: socket: fix recvmmsg not returning error from sock_error [ Upstream commit e623a9e9dec29ae811d11f83d0074ba254aba374 ] Commit 34b88a68f26a ("net: Fix use after free in the recvmmsg exit path"), changed the exit path of recvmmsg to always return the datagrams variable and modified the error paths to set the variable to the error code returned by recvmsg if necessary. However in the case sock_error returned an error, the error code was then ignored, and recvmmsg returned 0. Change the error path of recvmmsg to correctly return the error code of sock_error. The bug was triggered by using recvmmsg on a CAN interface which was not up. Linux 4.6 and later return 0 in this case while earlier releases returned -ENETDOWN. Fixes: 34b88a68f26a ("net: Fix use after free in the recvmmsg exit path") Signed-off-by: Maxime Jayat Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c9556862a64b5ac85bfee1cfd4313615dc21d6f8 Author: David S. Miller Date: Fri Feb 17 16:19:39 2017 -0500 irda: Fix lockdep annotations in hashbin_delete(). [ Upstream commit 4c03b862b12f980456f9de92db6d508a4999b788 ] A nested lock depth was added to the hasbin_delete() code but it doesn't actually work some well and results in tons of lockdep splats. Fix the code instead to properly drop the lock around the operation and just keep peeking the head of the hashbin queue. Reported-by: Dmitry Vyukov Tested-by: Dmitry Vyukov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 7babaac5d49ee7a88a5a324668dd13b575635d09 Author: Eric Dumazet Date: Tue Feb 14 09:03:51 2017 -0800 packet: fix races in fanout_add() [ Upstream commit d199fab63c11998a602205f7ee7ff7c05c97164b ] Multiple threads can call fanout_add() at the same time. We need to grab fanout_mutex earlier to avoid races that could lead to one thread freeing po->rollover that was set by another thread. Do the same in fanout_release(), for peace of mind, and to help us finding lockdep issues earlier. Fixes: dc99f600698d ("packet: Add fanout support.") Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state") Signed-off-by: Eric Dumazet Cc: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4fc8ff15e26ecfe974b7a3f386550bcfd259b8b1 Author: Eric Dumazet Date: Sun Feb 12 14:03:52 2017 -0800 net/llc: avoid BUG_ON() in skb_orphan() [ Upstream commit 8b74d439e1697110c5e5c600643e823eb1dd0762 ] It seems nobody used LLC since linux-3.12. Fortunately fuzzers like syzkaller still know how to run this code, otherwise it would be no fun. Setting skb->sk without skb->destructor leads to all kinds of bugs, we now prefer to be very strict about it. Ideally here we would use skb_set_owner() but this helper does not exist yet, only CAN seems to have a private helper for that. Fixes: 376c7311bdb6 ("net: add a temporary sanity check in skb_orphan()") Signed-off-by: Eric Dumazet Reported-by: Andrey Konovalov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit c42a8f220a83dc4460acd97e47388ffaa362c5b1 Author: Colin Ian King Date: Mon May 16 17:22:54 2016 +0100 rtc: interface: ignore expired timers when enqueuing new timers [ Upstream commit 2b2f5ff00f63847d95adad6289bd8b05f5983dd5 ] This patch fixes a RTC wakealarm issue, namely, the event fires during hibernate and is not cleared from the list, causing hwclock to block. The current enqueuing does not trigger an alarm if any expired timers already exist on the timerqueue. This can occur when a RTC wake alarm is used to wake a machine out of hibernate and the resumed state has old expired timers that have not been removed from the timer queue. This fix skips over any expired timers and triggers an alarm if there are no pending timers on the timerqueue. Note that the skipped expired timer will get reaped later on, so there is no need to clean it up immediately. The issue can be reproduced by putting a machine into hibernate and waking it with the RTC wakealarm. Running the example RTC test program from tools/testing/selftests/timers/rtctest.c after the hibernate will block indefinitely. With the fix, it no longer blocks after the hibernate resume. BugLink: http://bugs.launchpad.net/bugs/1333569 Signed-off-by: Colin Ian King Signed-off-by: Alexandre Belloni Signed-off-by: Sasha Levin commit 30f378e4208a34b767e1e7ef89644f4dd5e455d9 Author: Kent Overstreet Date: Wed Oct 26 20:31:17 2016 -0700 bcache: Make gc wakeup sane, remove set_task_state() [ Upstream commit be628be09563f8f6e81929efbd7cf3f45c344416 ] Signed-off-by: Kent Overstreet Signed-off-by: Sasha Levin commit 0f9c022c14de92601b65a475cb71c8dc2bc3d9a4 Author: Johannes Thumshirn Date: Tue Jan 31 10:16:00 2017 +0100 scsi: don't BUG_ON() empty DMA transfers [ Upstream commit fd3fc0b4d7305fa7246622dcc0dec69c42443f45 ] Don't crash the machine just because of an empty transfer. Use WARN_ON() combined with returning an error. Found by Dmitry Vyukov and syzkaller. [ Changed to "WARN_ON_ONCE()". Al has a patch that should fix the root cause, but a BUG_ON() is not acceptable in any case, and a WARN_ON() might still be a cause of excessive log spamming. NOTE! If this warning ever triggers, we may end up leaking resources, since this doesn't bother to try to clean the command up. So this WARN_ON_ONCE() triggering does imply real problems. But BUG_ON() is much worse. People really need to stop using BUG_ON() for "this shouldn't ever happen". It makes pretty much any bug worse. - Linus ] Signed-off-by: Johannes Thumshirn Reported-by: Dmitry Vyukov Cc: James Bottomley Cc: Al Viro Cc: stable@kernel.org Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin