From: Dave Jones

So, enough people bugged me about this to submit it for inclusion
as Documentation/feature-list-2.6.txt

It hasn't really seen any updates since 2.6.4 or so, so it's rotted a little
since then, but with it in-tree hopefully people will jump all over it,
and bring it a new lease of life.

Drop it into -mm for a little while and see what happens?

Signed-off-by: Andrew Morton
---

 25-akpm/Documentation/feature-list-2.6.txt | 1118 +++++++++++++++++++++++++++++
 1 files changed, 1118 insertions(+)

diff -puN /dev/null Documentation/feature-list-2.6.txt
--- /dev/null  Thu Apr 11 07:25:15 2002
+++ 25-akpm/Documentation/feature-list-2.6.txt  Thu Jan 13 15:18:56 2005
@@ -0,0 +1,1118 @@
+                The post-halloween document.
+                (aka, 2.6 - what to expect)
+
+Original author: Dave Jones, with help
+from many contributors.
+
+This document explains some of the new functionality to be found in the 2.6
+Linux kernel, some pitfalls you may encounter, and also points out some new
+features which could really use testing.
+Note that "contact foo@bar.com" below also implies that you should
+cc: linux-kernel@vger.kernel.org.
+
+Latest version of this document can always be found in
+Documentation/feature-list-2.6.txt
+
+Translations:
+Spanish - http://www.terra.es/personal/diegocg/post-halloween-2.6.es.txt
+German  - http://www.kubieziel.de/computer/halloween-german.html
+Polish  - http://soltysiak.com/linux/post-halloween-2.6.pl.html
+pt_BR   - http://www.maluco.com.br/docs/post-halloween-2.6-pt_BR.txt
+          http://people.nl.linux.org/~thiago/docs/post-halloween-2.6-pt_BR.txt
+
+Applying patches.
+~~~~~~~~~~~~~~~~~
+- In 2.4 and previous kernels, the recommended way to apply patches was
+  to use a command line such as ...
+    gzip -cd patchXX.gz | patch -p0
+  In 2.6, Linus started adding an extra path element to the diffs,
+  so using -p1 in the untarred 'to be patched' directory is necessary.
+
+Known gotchas.
+~~~~~~~~~~~~~~
+Certain known bugs are being reported over and over.
+Here are the workarounds.
+- Blank screen after decompressing kernel?
+  Make sure your .config has
+    CONFIG_INPUT=y
+    CONFIG_VT=y
+    CONFIG_VGA_CONSOLE=y
+    CONFIG_VT_CONSOLE=y
+  A lot of people have discovered that taking their .config from 2.4 and
+  running make oldconfig to pick up new options leads to problems, notably
+  with CONFIG_VT not being set.
+- An additional bug biting some people is that NICs fail to receive packets
+  (usually noticed when a NIC fails to get a DHCP lease, for example, despite
+  being sent one by the server). Booting with "noapic", "acpi=off", or a
+  combination of both fixes this for most people.
+- (Possibly linked to above bug) VIA APIC routing is currently broken.
+  Boot with 'noapic'.
+- Can't load any modules? You need updated tools (See modules section below).
+- depmod reports unresolved symbols? The depmod from modutils, rather than
+  the depmod from module-init-tools, is first in $PATH ($PATH may differ
+  between your normal user and root).
+- The boot command line argument mem= changed slightly in 2.6.
+  Arguments such as mem=exactmap mem=640k@0 mem=200M@1M
+  should now use memmap= instead.
+
+Regressions.
+~~~~~~~~~~~~
+(Things not expected to work just yet)
+- The hptraid/promise drivers for proprietary RAID formats are currently
+  non functional, and will probably be converted to use device-mapper.
+- Some filesystems still need work (Intermezzo, UFS, HFS, HPFS..)
+  - UMSDOS fs is currently missing, pending rewrite.
+  - EFS (has a blocksize problem, depending on the device that the
+    filesystem is being mounted on)
+- A number of drivers don't compile currently because they still need
+  work to convert them to the new APIs.
+- The format of /proc/stat changed, which could break some
+  applications that still depend on the old layout.
+- Some people seem to have trouble running rpm, most notably Red Hat 9 users.
+  This is a known bug of rpm.
+  Workaround: run "export LD_ASSUME_KERNEL=2.2.5" before running rpm.
+  This is thought to be a bug related to db4 and O_DIRECT interaction.
+
+Removed features.
+~~~~~~~~~~~~~~~~~
+- khttpd is gone.
+- Older Direct Rendering Manager (DRM) support (For XFree86 4.0)
+  has been removed. Upgrade to XFree86 4.1.0 or higher.
+- LVM1 has been removed. See Device-mapper below.
+- The system call table is no longer exported. Any module that relied
+  on this previously will no longer work.
+- Soundmodem hamradio support has been removed. Its functionality
+  has been superseded by a userspace replacement.
+  http://www.baycom.org/~tom/ham/soundmodem
+- Direct booting from floppy is no longer supported.
+  You should now use a boot loader program such as syslinux instead.
+  "make bzdisk" continues to work (now using syslinux).
+- Callout tty devices (/dev/cua) have been deprecated since 2.1.90pre2.
+  Support is now removed.
+
+
+Deprecated/obsolete features.
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- devfs will be obsoleted in favour of udev (http://www.kernel.org/pub/linux/utils/kernel/hotplug/)
+- boot time root= parsing changed.
+  ramdisks are now ram instead of rd and cm206 is cm206cd (instead of
+  cm206).
+  Additionally, the cciss driver needs the numeric device ID passed instead of
+  the device name.
+- usbdevfs will be going away in 2.7. The same filesystem can
+  be mounted as 'usbfs' in recent 2.4 kernels, and in 2.5.52
+  and above, which is the name the filesystem will be known by
+  from now on.
+- elvtune is deprecated (as are the ioctls it used).
+  Instead, the io scheduler tunables are exported in sysfs (see below)
+  in the /sys/block//queue/iosched directory.
+  Jens wrote a document explaining the tunables of the new scheduler at
+  http://www.lib.uaa.alaska.edu/linux-kernel/archive/2002-Week-44/att-deadline-iosched.txt
+- Using sysctls by numeric values is deprecated, and will go away
+  in the next development series.
+
+
+Modules.
+~~~~~~~~
+- The in-kernel module loader was reimplemented.
+- You need replacement module utilities from
+  http://www.kernel.org/pub/linux/kernel/people/rusty/modules/
+- A backwards compatible set of module utilities is also available
+  from the same URL in RPM format.
+- Debian sarge/sid or Conectiva snapshot users can just use
+  'apt-get install module-init-tools'
+- Modules now free stuff marked with __init or __initdata.
+- For Red Hat users, there's another pitfall in "/etc/rc.sysinit".
+  During startup, the script sets up the binary used to dynamically load
+  modules stored at "/proc/sys/kernel/modprobe". The initscript looks
+  for "/proc/ksyms", but since it doesn't exist in 2.6 kernels, the
+  binary used is "/sbin/true" instead.
+
+  This, eventually, will keep modules from working. Red Hat users will
+  have to patch the "/etc/rc.sysinit" script to set
+  "/proc/sys/kernel/modprobe" to "/sbin/modprobe", even
+  when "/proc/ksyms" doesn't exist.
+- Modules now have a .ko suffix instead of .o
+- Some (older) versions of 'mkinitrd' don't search for modules
+  that end with .ko, so update your mkinitrd if this is a problem.
+
+
+Kernel build system.
+~~~~~~~~~~~~~~~~~~~~
+- The build system is much improved compared to 2.4.
+  You should notice quicker builds, and less spontaneous rebuilds
+  of files on subsequent builds from already built trees.
+- There are new graphical config tools.
+  "make xconfig" now requires the qt libraries.
+  "make gconfig" uses gtk libraries.
+- Make menuconfig/oldconfig has no user-visible changes other than speed,
+  whilst numerous improvements have been made.
+- Several new debug targets exist: 'allyesconfig' 'allnoconfig' 'allmodconfig'.
+- Note: The new configuration system is not CML2 related.
+- Also note: Whilst some ideas were taken from it, Keith Owens'
+  kbuild-2.5 project was not integrated.
+- "make" is now the preferred command, without a target; it does
+  and modules.
+- "make -jN" is now the preferred parallel-make execution.
+  Do not bother to provide "MAKE=xxx"
+- The build is now much less verbose. If you want to see exactly what's
+  going on, try "make V=1" or set KBUILD_VERBOSE=1 in your environment.
+- 'make kernel/mm.o' will build the named file, provided a
+  corresponding source exists. This also works for (non-composite)
+  modules. (FIXME: broken for modules right now?)
+- 'make kernel/' will compile all files in a subdirectory and below.
+- There is no need to run 'make dep' at any stage.
+- 'make help' provides a list of typical targets, including debugging targets.
+- You can now build in a separate tree from the source by doing
+    make O=builddir
+
+
+IO subsystem.
+~~~~~~~~~~~~~
+- You should notice considerable throughput improvements over 2.4 due
+  to much reworking of the block and the memory management layers.
+- Report any regressions in this area to Jens Axboe
+  and Andrew Morton
+- Two different IO elevators are available. The default is the
+  anticipatory IO scheduler. You can select the deadline scheduler by
+  booting with "elevator=deadline" on the kernel command line.
+- For some workloads the anticipatory scheduler is around 10% slower
+  than deadline. Most notably, database workloads which seek all over the
+  disk performing reads and synchronous writes. Database folks will likely
+  want to boot with elevator=deadline to get that last bit of performance back.
+- Assorted changes throughout the block layer meant various block
+  device drivers had a large scale cleanup whilst being updated to
+  newer APIs.
+- The size and alignment of O_DIRECT file IO requests now matches that
+  of the device, not the filesystem. Typically this means that you
+  can perform O_DIRECT IO with 512-byte granularity rather than 4k.
+  But if you rely upon this, your application will not work on 2.4 kernels
+  and will not work on some devices.
+
+
+Block device size support.
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+- Thanks to work done by Peter Chubb, block devices can now access up to
+  16TB on 32-bit architectures, and up to 8EB on 64-bit architectures.
+- To use the new BLKGETSZ64 ioctls, you'll need updated file-utils.
+  (Currently only jfsutils 1.0.20 has this change, patches for other
+  filesystems are still pending merging)
+- The old 'struct statfs' is not able to describe large devices - the
+  statfs() system call will now return -EOVERFLOW for such devices.
+  A new system call called statfs64() with a new structure has been added
+  to support large devices. It is unknown at time of writing how many
+  userspace utilities have been converted to take advantage of this
+  syscall when available.
+
+
+POSIX ACLs & Extended attributes.
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- Userspace tools available at http://acl.bestbits.at
+
+
+VM Changes.
+~~~~~~~~~~~
+- Version zero swap partitions are no longer supported (everything is
+  using v1 now anyway - rerun mkswap if in doubt).
+  Linux 2.0.x requires v0 swap space, Linux v2.1.117 and later
+  support v1. mkswap(8) can format swap space in either format.
+- The actual 'reverse mappings' part of Rik van Riel's rmap vm was merged.
+  VM behaviour under certain loads should improve.
+- VM misbehaviour should be reported to linux-mm@kvack.org
+- The VM explicitly checks for sparse swapfiles, and aborts if one is found.
+- /proc/sys/vm/swappiness defines the kernel's preference for pagecache over
+  mapped memory. Setting it to 100 (percent) makes it treat both types of
+  memory equally. Setting it to zero makes the kernel very much prefer to
+  reclaim plain pagecache rather than mapped-into-pagetables memory.
+- The bdflush() syscall is now officially deprecated. The syscall
+  does nothing, and prints a stern warning to users. The functionality
+  is replaced by the pdflush daemons.
+- Due to various changes, swap files should be just as fast as swap partitions.
+- In 2.4, up to 64 swap files were possible.
+  In 2.6, this number is reduced to 32. Like 2.4, these files can be up to 64GB
+  in size, though you need a recent util-linux for a mkswap that supports >2GB
+
+
+Kernel preemption.
+~~~~~~~~~~~~~~~~~~
+- The much talked about preemption patches made it into 2.6.
+  With this included you should notice much lower latencies especially
+  in demanding multimedia applications.
+- Note, there are still cases where preemption must be temporarily disabled
+  where we do not. These areas occur in places where per-CPU data is used.
+- If you get "xxx exited with preempt count=n" messages in syslog,
+  don't panic, these are non fatal, but are somewhat unclean.
+  (Something is taking a lock, and exiting without unlocking)
+- If you DO notice high latency with kernel preemption enabled in
+  a specific code path, please report that to Andrew Morton
+  and Robert Love.
+  The report should be something like "the latency in my xyz application
+  hits xxx ms when I do foo but is normally yyy" where foo is an action
+  like "unlink a huge directory tree".
+
+
+Process scheduler improvements.
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- Another much talked about feature. Ingo Molnar reworked the process
+  scheduler to use an O(1) algorithm. In operation, you should notice
+  no changes with low loads, and increased scalability with large numbers
+  of processes, especially on large SMP systems.
+- Scheduler is now Hyperthreading SMP aware and will disperse processes
+  over physically different CPUs, instead of just over logical CPUs.
+- Robert Love wrote various utilities for changing behaviour of the
+  scheduler (binding processes to CPUs etc). You can find these tools at
+  http://tech9.net/rml/schedutils
+- The behavior of sched_yield() changed a lot. A task that uses
+  this system call should now expect to sleep for possibly a very
+  long time. Tasks that do not really desire to give up the
+  processor for a while should probably not make heavy use of this
+  function.
+  Unfortunately, some GUI programs (like Open Office) do make excessive use
+  of this call and under load their performance is poor. It seems this new
+  2.6 behavior is optimal but some user-space applications may need fixing.
+- The above applies to use of yield() in the kernel, too.
+- 2.6 adds system calls for manipulating a task's processor
+  affinity: sched_getaffinity() and sched_setaffinity()
+- Regressions to mingo@redhat.com and rml@tech9.net
+- Debian users who encounter effects such as skips in mp3
+  playback or jerky mouse movement may want to stop the
+  X server from renicing itself to -10.
+  You can alter this permanently with 'dpkg-reconfigure xserver-common';
+  if you elect not to have /etc/X11/Xwrapper.config managed by debconf,
+  simply edit it directly.
+- Balancing of IRQs between multiple CPUs should be handled using the
+  irqbalance (http://people.redhat.com/arjanv/irqbalance/) program.
+- David Mosberger maintains a webpage containing some current 'known gotchas'
+  of the O(1) scheduler at http://www.hpl.hp.com/research/linux/kernel/o1.php
+
+
+PCI.
+~~~~
+- PCI domain support has been added. For most people, this just means that
+  all PCI slot names are extended with "0000:" on the front, but for people
+  with bigger servers it means they're able to access all their PCI devices.
+- More hotplug drivers have been added, including a fake PCI hotplug driver
+  so people without specialised hardware can test hotplug features.
+
+Random.
+~~~~~~~
+- /dev/hwrandom got support for some new hardware (now also backported to 2.4)
+  such as the HW RNG on newer VIA Cyrix CPUs.
+- rng-tools can be found at http://sourceforge.net/projects/gkernel
+
+
+Fast userspace mutexes (Futexes).
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- Rusty Russell added functionality that allows userspace to have
+  fast mutexes that only use syscalls when there is contention. Used by
+  NPTL.
+- Additional information on futexes can be found in Ulrich Drepper's
+  paper on the subject at http://people.redhat.com/drepper/futex.pdf
+- Bert Hubert has written some documentation on this functionality
+  at http://ds9a.nl/futex-manpages
+
+
+epoll
+~~~~~
+Davide Libenzi wrote an event based poll replacement that got
+included in 2.6. More info available at
+http://www.xmailserver.org/linux-patches/nio-improve.html
+http://lwn.net/Articles/13587/
+
+
+Threading improvements.
+~~~~~~~~~~~~~~~~~~~~~~~
+- Ingo Molnar put a lot of work into threading improvements for 2.6.
+  Some of the features of this work are:
+  - Generic pid allocator (arbitrary number of PIDs with no slowdown,
+    unified pidhash).
+  - Thread Local Storage syscalls
+  - sys_clone() enhancements (CLONE_SETTLS, CLONE_PARENT_SETTID, CLONE_SETTID,
+    CLONE_CLEARTID, CLONE_DETACHED)
+  - POSIX thread signals stuff (atomic signals, shared signals, etc.)
+  - Per-CPU GDT
+  - Threaded coredumping support
+  - sys_exit() speedups (O(1) exit)
+  - Generic, improved futexes, vcache
+  - New, threading related ptrace features
+  - exit/fork task cache
+  - /proc updates for threading
+  - API changes for threading.
+- Users should notice a significant speedup in basic thread operations.
+  This is true to a lesser extent even for old-threading userspace libraries
+  such as LinuxThreads.
+- Regressions should go to Ingo Molnar and
+  phil-list@redhat.com. Regressions could happen in the area of signal
+  handling and related threading semantics, plus coredumping.
+- Native Posix Threading Library (NPTL).
+  Ulrich Drepper worked closely with Ingo on the threading enhancements, and
+  developed a 1:1 model threading library. You can find out more about NPTL at
+  http://people.redhat.com/drepper/nptl-design.pdf
+
+
+Enhanced coredumping.
+~~~~~~~~~~~~~~~~~~~~~
+- 2.6 offers you the ability to configure the way core files are
+  named through a /proc/sys/kernel/core_pattern file.
+  You can use various format identifiers in this name to affect
+  how the core dump is named.
+
+    %p - insert pid into filename
+    %u - insert current uid into filename
+    %g - insert current gid into filename
+    %s - insert signal that caused the coredump into the filename
+    %t - insert UNIX time that the coredump occurred into filename
+    %h - insert hostname where the coredump happened into filename
+    %e - insert coredumping executable name into filename
+
+  You should ensure that the string does not exceed 64 bytes.
+- Multithreaded processes can now dump core
+
+
+Input layer.
+~~~~~~~~~~~~
+- Possibly the most visible change to the end user. If misconfigured,
+  you'll find that your keyboard/mouse/other input device will no longer work.
+  2.6 offers a much more flexible interface to devices such as keyboards.
+- The downside is more confusing options.
+  In the "Input device support" menu, be sure to enable at least the following.
+
+    --- Input I/O drivers
+    < > Serial i/o support
+    < >   i8042 PC Keyboard controller
+    [ ] Keyboards
+    [ ] Mice
+
+  (Also choose the relevant keyboard/mouse from the list)
+
+- If you find your keyboard/mouse still don't work, edit the file
+  drivers/input/serio/i8042.c, and replace the #undef DEBUG
+  with a #define DEBUG, recompile and reinstall.
+
+  When you boot, you should now see a lot more debugging information.
+  Forward this information to Vojtech Pavlik
+
+- If you use a KVM switcher, and experience problems, booting with the boot
+  time argument 'psmouse_noext' should fix your problems.
+- Users of multimedia keys without X will see changes in how the kernel
+  handles those keys. People who customize keymaps or keycodes in 2.4
+  may need to make some changes in 2.6
+- Users wanting support for the PC speaker need to enable CONFIG_INPUT_PCSPKR,
+  or you won't get a single beep.
+- Synaptics touchpad users may be interested to check out
+  http://w1.894.telia.com/~u89404340/touchpad/
+- In 2.4 users of Japanese keyboards were able to type '|' or
+  '\' characters without loading any custom keymap on the
+  console. With the keymap in 2.6, this is not possible
+  anymore. People with these keyboards have to load a keymap
+  with loadkeys rebuilt from the source, since loadkeys in some
+  vendor distributions cannot load keycodes larger than 127.
+  There is a patch to fix this, but it has not been integrated
+  (http://tinyurl.com/t75a).
+- A FAQ on common problems with the new input layer is available
+  at http://lwn.net/Articles/69107/
+
+
+PnP layer.
+~~~~~~~~~~
+- Support for plug and play devices such as early ISAPnP cards has improved a
+  lot in the 2.6 kernel. The new code behaves more like the code
+  handling PCI devices (probe, remove etc callbacks), and also merges
+  PnP BIOS access code.
+- Report any regressions in plug & play functionality to
+  Adam Belay
+
+
+ALSA.
+~~~~~
+- The advanced linux sound architecture was merged into 2.6.
+  This offers considerably improved functionality over the older OSS drivers,
+  but requires new userspace tools.
+- Several distros have shipped ALSA for some time, so you may already have the
+  necessary tools. If not, you can find them at http://www.alsa-project.org/
+- ALSA can emulate the OSS interface using the snd_pcm_oss/snd_pcm_mixer
+  modules. If your card produces nothing but silence, you may need to run
+  alsamixer to unmute channels which /dev/mixer doesn't see.
+- Note that the OSS drivers are also still functional, and still present.
+  Many features/fixes that went into 2.4 are still not applied to these
+  drivers, and it's still unclear if they will remain when 2.6 ships.
+  The long term goal is to get everyone moved over to (the superior) ALSA.
+
+
+AGP.
+~~~~
+- The agpgart driver got a long overdue cleanup which involved
+  splitting it into an agpgart core, and per-chipset drivers.
+  You may need to adjust your modules configuration to autoload
+  the chipset drivers on loading the agpgart module.
+- Generic AGP 3.0 support is now included.
+
+DRI.
+~~~~
+- Direct rendering in 2.6 hasn't had much (if any?) testing on
+  older versions of XFree86. Feedback on whether 4.1 works would
+  be useful.
+
+
+Faster system calls.
+~~~~~~~~~~~~~~~~~~~~
+- Systems that support the SYSENTER extension (Basically Intel Pentium-II
+  and above, and AMD Athlons) now have a faster method of making the
+  transition from userspace to kernelspace when a syscall is performed.
+- Pentium Pro also has SYSENTER, but it is unusable due to errata.
+- Without an updated glibc, this will not be noticeable.
+- VMWare 4 users may get crashes due to this.
+  Zwane Mwaikambo wrote a patch for a "nosysenter" option which is worth
+  googling for if there isn't a vmware update available.
+- Regressions to torvalds@osdl.org and libc-alpha@sources.redhat.com
+
+
+procps.
+~~~~~~~
+- The 2.6 /proc filesystem changed some statistics, which confuses older
+  versions of procps. Rik van Riel and Robert Love have been maintaining a
+  version of procps during the development of 2.6 which tracks changes to
+  /proc which you can find at http://tech9.net/rml/procps/
+- Alternatively, the procps by Albert Cahalan now supports the altered formats
+  since v3.0.5 -- http://procps.sf.net/
+- The /proc/meminfo format changed slightly which also broke gtop in strange
+  ways. Likely this also broke some of the KDE/GNOME panel applets.
+
+
+Framebuffer layer.
+~~~~~~~~~~~~~~~~~~
+- James Simmons has reworked the framebuffer/console layer considerably for
+  2.6. Support for some cards is still lagging a little, but it should be
+  functionally no different than previous incarnations.
+- Boot time arguments may have changed depending on your driver.
+  An example of the change is..
+    append = "video=radeon:1024x768-24@100"
+  needs to become..
+    append = "video=radeonfb:1024x768-24@100"
+- Current userspace tools (fbset for eg) are not yet updated,
+  and won't function as expected.
+- The VESA framebuffer now enables MTRRs for the framebuffer memory range during
+  initialisation (Note: PCI cards only).
+  If you notice screen corruption, please report this, along with an lspci output,
+  so your card can be blacklisted.
+- Any problems should go to
+
+
+IDE.
+~~~~
+- The IDE code rewrite was subject to much criticism in early 2.5.x, which
+  put off a lot of people from testing. This work was then subsequently
+  dropped, and reverted back to a 2.4.18 IDE status.
+  Since then additional work has occurred, but not to the extent
+  of the first cleanup attempts.
+- Known problems with the current IDE code.
+  o  Simplex IDE devices (eg Ali15x3) are missing DMA sometimes
+  o  Most PCMCIA devices have unload races and may oops on eject
+  o  Modular IDE does not yet work, modular IDE PCI modules sometimes
+     oops on loading
+  o  ide-scsi is completely broken in 2.6 currently. Known problem.
+     If you need it either use 2.4 or fix it 8)
+- IDE disk geometry translators like OnTrack, EZ Partition, Disk Manager
+  are no longer autodetected. The only way forward is to remove the translator
+  from the drive, and start over, or use boot parameters depending on the
+  type of remapper used :-
+    hdx=remap63 - add 63 to each sector (For OnTrack DM)
+    hdx=remap   - remap 0->1 (For EZDrive)
+- See also the CD Recording section for some important changes
+  related to IDE CD writers.
+
+IDE TCQ.
+~~~~~~~~
+- Tagged command queueing for IDE devices has been included.
+- Not all combinations of controllers & devices may like this,
+  so handle with care.
+  READ AS: ** Don't use IDE TCQ on any data you value.
+  It's likely bad combinations will be blacklisted as and when discovered.
+
+- If you didn't choose the "TCQ on by default" option, you can enable
+  it by using the command
+
+    echo "using_tcq:32" > /proc/ide/hdX/settings
+
+  (replacing 32 with 0 disables TCQ again).
+
+- Report success/failure stories to Jens Axboe with
+  inclusion of hdparm -i /dev/hdX, and lspci output.
+
+
+SCSI.
+~~~~~
+- Various SCSI drivers still need work, and don't even compile.
+- Various drivers currently lack error handling.
+  These drivers will cause warnings during compilation due to
+  missing abort: & reset: functions.
+- Note, that some drivers have had these members removed, but still
+  lack error handling. Those noticed so far are ncr53c8xxx, sym53c8xx
+- large dev_t support allowing thousands of disks to be
+  supported (was 128 or 256 in the 2.4 series)
+- major code cleanup, initially to support the block layer (bio)
+  improvements have led to:
+  - better throughput (?) [less double handling of data]
+  - per HBA locks (there was a single io_request_lock in
+    the 2.4 series)
+  - more flexible interface to HBA drivers
+  - better hotplug support, especially for USB mass storage
+    and ieee1394 sbp2 devices [well it's work_in_progress]
+- improved error processing and scanning code (support for
+  large, sparse lun spaces)
+- lots of scsi driver internals available via sysfs
+
+
+v4l2.
+~~~~~
+- The video4linux API finally got its long awaited cleanup.
+- xawtv, bttv and most other existing v4l tools are also compatible
+  with the new v4l2 layer. You should notice no loss in functionality.
+- See http://bytesex.org/v4l/ for more information.
+
+
+Quota reworking.
+~~~~~~~~~~~~~~~~
+The new quota system needs new tools. Supports 32 bit uids.
+http://www.sf.net/projects/linuxquota/
+
+
+CD Recording.
+~~~~~~~~~~~~~
+- Jens Axboe added the ability to use DMA for writing CDs on
+  ATAPI devices. Writing CDs should be much faster than it
+  was in 2.4, and also less prone to buffer underruns and the like.
+- With a recent cdrecord, you also no longer need ide-scsi in order to use
+  an IDE CD writer.
+- Ripping audio tracks off of CDs now also uses DMA and should be
+  notably faster. You can also find an updated cdda2wav at:
+  *.kernel.org/pub/linux/kernel/people/axboe/tools/
+- Send good/bad reports of audio extraction with cdda2wav and burning with
+  cdrecord to Jens Axboe
+- Currently only 'open by device name' works in cdrecord.
+    cdrecord -dev=/dev/hdX -inq
+- More info at http://lwn.net/Articles/13538/ & http://lwn.net/Articles/13160/
+
+
+USB:
+~~~~
+- USB host controller drivers were renamed in 2.6. They are now
+    uhci-hcd for UHCI controllers.
+    ohci-hcd for OHCI controllers.
+    ehci-hcd for EHCI (USB 2.0) controllers.
+- Very few user-visible changes; the only noticeable 'major' change
+  is that there is now only one UHCI driver. As noted elsewhere, usbdevfs
+  was renamed to usbfs.
+- USB-storage has changed behaviour. A device which is disconnected and
+  then reconnected is not reassociated with the old /dev node.
+- USB storage also got several performance enhancements.
+
+- USB 'gadget' support.
+  There's a new "USB Gadget" API supporting USB devices that
+  run Linux inside. Examples include PDAs, cable modems,
+  and some printers. That API is how the driver for the
+  USB Device Controller (UDC) hardware talks with portable
+  "gadget drivers". A gadget driver is what makes that
+  hardware act like a "network link" or a "printer".
+
+  When you don't want to write a gadget driver in the kernel,
+  then "gadgetfs" lets you do it in user mode programs.
+  Each endpoint appears as a single file, so it's a lot
+  simpler than "usbfs". Currently it's purely synchronous,
+  but it should be natural for someone to add AIO support.
+
+  See http://www.linux-usb.org/gadget for more information
+  about this API framework, including a pthreaded example
+  "gadgetfs" program. See the 2.6 kerneldoc for API info.
+
+
+Nanosecond stat:
+~~~~~~~~~~~~~~~~
+The stat64() syscall was changed to return jiffies granularity.
+This allows make(1) to make better decisions on whether or not it
+needs to recompile a file. Not all filesystems may support such precision.
+
+
+Filesystems:
+~~~~~~~~~~~~
+A number of additional filesystems have made their way into 2.6.
+Currently it supports: ext2, ext3, reiserfs, jfs, xfs, minix, romfs,
+iso9660, udf, msdos, vfat, ntfs (ro), adfs, amiga ffs, apple macintosh hfs,
+BeOS befs (ro), bfs, efs (ro), cramfs, free vxfs, os/2 hpfs, qnx4fs,
+sysvfs, ufs.
+Whilst these have had testing out of tree, the level of testing
+after merging is unparalleled. Be wary of trusting data to immature
+filesystems. A number of new features and improvements have also
+been made to the existing filesystems from 2.4.
+
+Reports of stress testing with the various tools available would
+be beneficial.
+
+
+Generic VFS changes.
+~~~~~~~~~~~~~~~~~~~~
+- Since Linux 2.5.1 it is possible to atomically move a subtree to
+  another place. The usage is...
+    mount --move olddir newdir
+- Since 2.5.43, dmask=value sets the umask applied to directories only.
+  The default is the umask of the current process.
+  The fmask=value sets the umask applied to regular files only.
+  Again, the default is the umask of the current process.
+- Directories can now be marked as synchronous using chattr +S,
+  so that all changes will be immediately written to disk.
+  Note, this does not guarantee atomicity, at least not for all filesystems
+  and for all operations. You *can* be guaranteed that system calls will
+  not return until the changes are on disk; note though that this does have
+  some significant performance impacts.
+
+
+
+devfs.
+~~~~~~
+- devfs was somewhat stripped down and a lot of duplicate functionality
+  was removed. You now need to enable CONFIG_DEVPTS_FS=y and mount
+  the devpts filesystem in the same manner you would if you were not
+  using devfs.
+
+
+EXT2.
+~~~~~
+- 2.5.49 included an extension to ext2 which will cause it to not attach
+  buffer_head structures to file or directory pagecache at all, ever.
+  This is for the big highmem machines. It is enabled via the `-o nobh'
+  mount option.
+- The ext2 filesystem is now using finer-grained locking which yields reduced
+  context switch rates and higher throughput on large SMP machines.
+
+
+EXT3.
+~~~~~
+- The ext3 filesystem has gained indexed directory support, which offers
+  considerable performance gains when used on filesystems with directories
+  containing large numbers of files.
+- In order to use the htree feature, you need at least version 1.32 of
+  e2fsprogs.
+- Existing filesystems can be converted using the command
+
+    tune2fs -O dir_index /dev/hdXXX
+
+- The latest e2fsprogs can be found at
+  http://prdownloads.sourceforge.net/e2fsprogs
+- The ext2 and ext3 filesystems have new file allocation policies (the "Orlov
+  allocator") which will place subdirectories closer together on-disk. This
+  tends to mean that operations which touch many files in a directory tree are
+  much faster if that tree was created under a 2.6 kernel.
+
+Reiserfs.
+~~~~~~~~~
+- Reiserfs now supports inode attributes such as immutable.
+  (Also included in 2.4.17, so not really 'new').
+- Relocated/non-standard size journal support (also backported
+  to 2.4.22pre3)
+- Support for writes larger than 4KB in size, which means speedups
+  on large file writes, esp in append mode; it should also be more
+  SMP friendly.
+- Variable blocksize support. (Ie, you can choose any blocksize
+  in the range of 1024 .. PAGE_CACHE_SIZE, must be power of 2).
+- A bug in kmail was triggered by some optimisations in reiserfs in 2.6.
+  Upgrading kmail should fix this, as should mounting the reiserfs
+  partition with 'nolargeio=1'
+
+
+NFS.
+~~~~
+- Basic support has been added for NFSv4 (server and client)
+- Additionally, kNFSD now supports transport over TCP.
+  This experimental feature is also backported to 2.4.20
+- Interoperability reports with other OSes would be useful.
+- v1.0.3 of nfs-utils supports the change of kdev_t type in the
+  newer 2.6 kernels. You can grab it at http://nfs.sourceforge.net
+- Problems to nfs@lists.sourceforge.net
+
+
+NTFS.
+~~~~~
+- A new, rewritten NTFS driver was merged for 2.6. It has the
+  following main benefits over the old driver:
+  - SMP and reentrant safe
+  - support for cluster sizes bigger than 4 kB
+  - full support for sparse files on W2K/XP/W2K3
+  - mmap() support
+  - More stable, and much faster than the previous NTFS driver.
+  - Still read-only, but with safe file overwrite support without changes
+    to the file size
+  - More information is available at http://linux-ntfs.sf.net
+
+
+sysfs.
+~~~~~~
+In simple terms, the sysfs filesystem is a saner way for
+drivers to export their innards than /proc.
+This filesystem is always compiled in, and can be mounted
+just like any other virtual filesystem. No userspace tools
+beyond cat(1) and echo(1) are needed. tree(1) is also good for
+viewing its overall structure.
+
+    mount -t sysfs none /sys
+
+See Documentation/filesystems/sysfs.txt for more info.
+
+
+JFS.
+~~~~
+IBM's JFS was merged for 2.6. (And backported to 2.4.20, but
+it was still a new feature here first.) You can read more about JFS at
+http://www-124.ibm.com/developerworks/oss/jfs/index.html
+
+
+XFS.
+~~~~
+The SGI XFS filesystem has been merged, and has a number of userspace
+features. Users are encouraged to read http://oss.sgi.com/projects/xfs
+for more information.
+The various utilities for creating and manipulating XFS volumes can
+be found on SGI's ftp server:
+ftp://oss.sgi.com/projects/xfs/download/download/cmd_tars/xfsprogs-2.5.4.src.tar.gz
+
+
+CIFS.
+~~~~~
+Support utilities and documentation for the common internet file system (CIFS)
+can be found at http://us1.samba.org/samba/Linux_CIFS_client.html
+
+
+FAT.
+~~~~
+CVF (Compressed VFAT) support has been removed. This means you
+will no longer be able to access DriveSpace partitions.
+
+
+HugeTLBfs.
+~~~~~~~~~~
+Files in this filesystem are backed by large pages if the CPU
+supports them. See Documentation/vm/hugetlbpage.txt for more details.
+
+
+Internal filesystems.
+~~~~~~~~~~~~~~~~~~~~~
+/proc/filesystems will contain several filesystems that are not
+mountable in userspace, but are used internally by the kernel
+to keep track of things. Amongst these filesystems are futexfs
+and eventpollfs.
+
+
+Kernel Asynchronous I/O (AIO) Support
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Support for kernel AIO has been included in 2.6.
+
+AIO enables even a single application thread to overlap I/O
+operations with other processing, by providing an interface
+for submitting one or more i/o requests in one system call
+(io_submit) without waiting for completion, and a separate
+interface (io_getevents) to reap completed i/o operations
+associated with a given completion group.
+
+The following is a quick summary of what works today as
+expected:
+- AIO read and write on raw (and O_DIRECT on blockdev)
+- AIO read and write on files opened with O_DIRECT on
+  ext2, ext3, jfs, xfs
+
+And what doesn't work as expected or is not currently
+supported:
+- AIO read and write on files opened without O_DIRECT
+  (i.e. normal buffered filesystem AIO). On ext2, ext3,
+  jfs, xfs and nfs, these do not return an explicit
+  error, but quietly default to synchronous or rather
+  non-AIO behaviour (i.e. io_submit waits for i/o to complete
+  in these cases). For most other filesystems, -EINVAL is
+  reported.
+- AIO fsync (not supported for any filesystem)
+- AIO read and write on sockets (doesn't return an
+  explicit error, but quietly defaults to synchronous
+  or rather non-AIO behaviour)
+
+You need to install libaio-0.3.92 (available at
+http://www.kernel.org/pub/linux/kernel/people/bcrl/aio/)
+if you are writing AIO applications which use the native
+AIO interfaces.
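
The submit/reap cycle described in the AIO section can be sketched in C using the raw system calls instead of libaio. This is only an illustration under stated assumptions, not the libaio API: the wrapper names (sys_io_setup and friends) and the helper aio_read_once() are hypothetical, and the file is opened without O_DIRECT for simplicity, so as noted in the list of limitations the kernel may quietly complete the request synchronously inside io_submit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/aio_abi.h>   /* aio_context_t, struct iocb, struct io_event */
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical thin wrappers around the raw AIO system calls. */
static long sys_io_setup(unsigned nr_events, aio_context_t *ctx)
{
    return syscall(SYS_io_setup, nr_events, ctx);
}

static long sys_io_submit(aio_context_t ctx, long nr, struct iocb **iocbs)
{
    return syscall(SYS_io_submit, ctx, nr, iocbs);
}

static long sys_io_getevents(aio_context_t ctx, long min_nr, long nr,
                             struct io_event *events, struct timespec *timeout)
{
    return syscall(SYS_io_getevents, ctx, min_nr, nr, events, timeout);
}

static long sys_io_destroy(aio_context_t ctx)
{
    return syscall(SYS_io_destroy, ctx);
}

/*
 * Hypothetical helper: read up to len bytes from the start of path using
 * one AIO request. Returns the byte count, -1 if setup or submission
 * fails, or the negative errno reported by the completion event.
 */
static long aio_read_once(const char *path, void *buf, size_t len)
{
    aio_context_t ctx = 0;          /* must be zero before io_setup */
    struct iocb cb;
    struct iocb *list[1] = { &cb };
    struct io_event ev;
    long result = -1;
    int fd;

    assert(buf != NULL && len > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (sys_io_setup(1, &ctx) < 0) {
        close(fd);
        return -1;
    }

    /* Describe a single positional read at offset 0. */
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    /* Submit one request, then wait until exactly one completion arrives. */
    if (sys_io_submit(ctx, 1, list) == 1 &&
        sys_io_getevents(ctx, 1, 1, &ev, NULL) == 1)
        result = (long)ev.res;

    sys_io_destroy(ctx);
    close(fd);
    return result;
}
```

A real application would normally queue several iocbs in one io_submit call and carry on with other work before calling io_getevents. With O_DIRECT, buffers generally also need to be suitably aligned, which this sketch does not attempt.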
+
+More info is available at http://lse.sf.net/io/aio.html
+
+
+Profiling.
+~~~~~~~~~~
+- A system wide performance profiler (Oprofile) has been included in 2.6.
+  With this option compiled in, you'll get an oprofilefs filesystem
+  which you can mount, that the userspace utilities talk to.
+  You can find out more at http://oprofile.sf.net/
+- You need a fixed readprofile utility for 2.6.
+  Present in util-linux as of 2.11z
+
+
+
+Improved BIOS table support.
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- Linux now supports various new BIOS extensions.
+
+
+Simple boot flag support.
+~~~~~~~~~~~~~~~~~~~~~~~~~
+The SBF specification is an x86 BIOS extension that allows improved
+system boot speeds. It does this by marking a CMOS field to say
+"I booted okay, skip extensive POST next reboot".
+Userspace tool is at http://www.codemonkey.org.uk/projects/sbf/sbf.c
+More info on SBF is at http://www.microsoft.com/hwdev/resources/specs/simp_bios.asp
+
+
+EDD Support.
+~~~~~~~~~~~~
+- Support for BIOS Enhanced Disk Drive Services (EDD) was added,
+  which exports information on what the BIOS thinks is the boot
+  drive and other useful info to /sys/firmware/edd
+- Matt Domsch is interested in hearing successes/failures on this code
+  with some simple tests described at http://linux.dell.com/edd/results.html
+
+
+Improved system monitoring.
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- lm_sensors.
+  - Shipped in vendor kernels for years, lm_sensors is now part of mainline.
+    It does however have a different interface. (sysfs instead of /proc)
+  - http://www.xs4all.nl/~thospel/ASIS/bin/psensors is a handy script
+    for parsing the new sysfs fields.
+- IPMI. (Intelligent Platform Management Interface)
+  - IPMI is a standard for monitoring the hardware in a system.
+  - Project home page: http://openipmi.sourceforge.net
+  - Specification: http://www.intel.com/design/servers/ipmi/spec.htm
+
+
+x86 CPU detection.
+~~~~~~~~~~~~~~~~~~
+- The CPU detection code got a pretty hefty shake up. To be certain your
+  CPU has all relevant workarounds applied, be sure to check that it was
+  detected correctly. cat /proc/cpuinfo will tell you what the kernel thinks
+  it is.
+- Likewise, the x86 MTRR driver got a considerable makeover.
+  Check that XFree86 sets up MTRRs in the same way it did in 2.4
+  (Failures will get logged in /var/log/XFree86.log)
+- Early PII Xeon processors and possibly other early PII processors
+  require microcode updates either from the BIOS or the microcode driver
+  to work around CPU bugs the O(1) scheduler exposes.
+  You can find the relevant microcode tools at
+  http://www.urbanmyth.org/microcode/
+- Any regressions in either should go to mochel@osdl.org Cc: davej@codemonkey.org.uk
+
+
+Extra tainting.
+~~~~~~~~~~~~~~~
+Running certain AMD processors in SMP boxes is out of spec, and will taint
+the kernel with the 'S' flag. Running 2 Athlon XPs for example may seem to
+work fine, but may also introduce difficult to pin down bugs.
+In time it's likely this tainting will be extended to cover other out of
+spec cases.
+
+Additionally, the new modules interface will taint the kernel if you try
+to 'force' a module to load with insmod -f.
+
+
+Power management.
+~~~~~~~~~~~~~~~~~
+- 2.6 contains a more up to date snapshot of the ACPI driver. Should
+  you experience any problems booting, try booting with the argument
+  "acpi=off" to rule out any ACPI interaction. ACPI has a much more involved
+  role in bringing the system up in 2.6 than it did in 2.4
+- The old "acpismp=force" boot option is now obsolete, and will be ignored
+  due to the old "mini ACPI" parser being removed.
+- Software suspend is still in development, and in need of more work.
+  Use with SMP and/or PREEMPT is not advised.
+- The ACPI code will do basic sanity checks on the DMI structure in the BIOS
+  to determine the date it was written. BIOSes older than year 2000 are
+  assumed to be broken. In some circumstances, this assumption is wrong.
+  If you see a message saying ACPI is disabled for this reason, try booting
+  with acpi=force. If things work fine, send the output of dmidecode
+  (http://www.nongnu.org/dmidecode/) to acpi-devel@lists.sf.net
+  with an explanation of why your BIOS shouldn't be blacklisted.
+
+CPU frequency scaling.
+~~~~~~~~~~~~~~~~~~~~~~
+Certain processors have the facility to scale their voltage/clockspeed.
+2.6 introduces an interface to this feature, see Documentation/cpufreq
+for more information. This functionality also covers features like
+Intel's speedstep, and the Powernow! feature present in mobile AMD Athlons.
+In addition to x86 variants, this framework also supports various ARM CPUs.
+You can find a userspace daemon that monitors battery life and
+adjusts accordingly at: http://sourceforge.net/projects/cpufreqd
+
+
+Background polling of MCE.
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+The machine check handler has been extended so that it regularly polls
+for any problems on AMD Athlon, and Intel Pentium 4 systems.
+This may result in machine check exceptions occurring more frequently
+than they did in 2.4 on out of spec systems (Overclocking/inadequate
+cooling/underrated PSU etc..).
+
+
+LVM2 - DeviceMapper.
+~~~~~~~~~~~~~~~~~~~~
+The LVM1 code was removed wholesale, and replaced with a much better
+designed 'device mapper'.
+- This is backwards compatible with the LVM1 disk format.
+- Device mapper does require new tools to manage volumes however.
+  You can get these from ftp://ftp.sistina.com/pub/LVM2/tools/
+
+
+Debugging options.
+~~~~~~~~~~~~~~~~~~
+During the stabilising period, it's likely that the debugging options
+in the kernel hacking menu will trigger quite a few problems.
+Please report any of these problems to linux-kernel@vger.kernel.org
+rather than just disabling the relevant CONFIG_ options.
+
+Merging of kksymoops means that the kernel will now spit out
+automatically decoded oopses (no more feeding them to ksymoops).
+For this reason, you should always enable the option in the
+kernel hacking menu labelled "Load all symbols for debugging/kksymoops".
+
+Testing with CONFIG_PREEMPT will also increase the amount of debug
+code that gets enabled in the kernel. Kernel preemption gives us
+the ability to do a whole slew of debugging checks like sleeping
+with locks held, scheduling while atomic, exiting with locks held, etc.
+
+
+Compiler issues.
+~~~~~~~~~~~~~~~~
+- The recommended compiler (for x86) is still gcc 2.95.3.
+- When compiled with a modern gcc (i.e. gcc 3.x), 2.6 will use additional
+  optimisations that 2.4 didn't. This may shake out compiler bugs that
+  2.4 didn't expose.
+- Do not use gcc 3.0.x on x86 due to a stack pointer handling bug.
+- gcc 2.96 is not supported with CONFIG_FRAME_POINTER=y due to a stack
+  pointer handling bug.
+
+
+Security concerns.
+~~~~~~~~~~~~~~~~~~
+Several security issues solved in 2.4 may not yet be forward ported
+to 2.6. For this reason 2.6.x kernels should not be tested on
+untrusted systems. Testing known 2.4 exploits and reporting results
+is useful.
+
+SELinux.
+~~~~~~~~
+NSA Security-Enhanced Linux (SELinux) was merged in 2.6.
+SELinux is not configured in by default. If you configure it in,
+it defaults to enabled. If you also configure the boot parameter,
+you can use it to disable SELinux (selinux=0); selinux=1 is redundant,
+as that is the default.
+
+You can obtain SELinux tools and an example policy configuration from
+http://www.nsa.gov/selinux
+
+
+
+Networking.
+~~~~~~~~~~~
+- ebtables
+  The bridging firewall code was merged. To manage these you'll
+  need the ebtables tool available from
+  http://users.pandora.be/bart.de.schuymer/ebtables/
+  More on bridge-nf can be found at http://bridge.sourceforge.net
+- Bridged packets can now be 'seen' by iptables.
+- IPSec
+  Linux finally has IPSec support in mainline. Use the KAME tools port on
+  http://sourceforge.net/projects/ipsec-tools
+  For more info see http://www.lib.uaa.alaska.edu/linux-kernel/archive/2002-Week-44/1127.html
+  Also Bert Hubert has a howto at http://lartc.org/howto/lartc.ipsec.html
+  Additionally, ipsec-utils is at http://sourceforge.net/projects/ipsec-tools
+  Herbert Xu also has patches against FreeSWAN 2.00 to allow its userspace
+  to use the 2.6 IPSec functionality. They can be downloaded from
+  http://gondor.apana.org.au/~herbert/freeswan/
+  An additional HOWTO is at http://www.ipsec-howto.org
+- Some applications may trigger the kernel to spit out warnings about
+  'process xxx using obsolete setsockopt SO_BSDCOMPAT' .
+  - Bind 9.2.2 checks for #ifdef SO_BSDCOMPAT incorrectly,
+    so a recompile is all that is needed.
+  - bind9-host from debian testing triggers, though the 'host' package doesn't.
+  - process `snmpd' is using obsolete setsockopt SO_BSDCOMPAT
+  - process `snmptrapd' is using obsolete setsockopt SO_BSDCOMPAT
+  - ntop uses obsolete (PF_INET,SOCK_PACKET)
+- Users of boxes with >1 NIC may find that for eg, eth0 and eth1 refer to
+  the opposites of what they did in 2.4. This is a bug that will be fixed
+  before 2.6.0. One option (or management workaround) for this is to use
+  'nameif' to name Ethernet interfaces. There is a HOWTO for doing this at
+
+- Support for various new RFCs.
+  - RFC3173 (IP Payload Compression).
+  - RFC3041 (IPv6 Privacy Extensions).
+  - RFC2473 (IPv6 in IPv6 tunnels).
+  - RFC2960 (SCTP - see below).
+- Linux reaches congestion collapse when subjected to heavy network load.
+  NAPI fixes this, amongst other things, thereby improving network
+  performance.
+  More info at http://www.cyberus.ca/~hadi/usenix-paper.tgz and
+  ftp://robur.slu.se/pub/Linux/net-development/NAPI/
+- IPVS (IP Virtual Server)
+  http://www.linuxvirtualserver.org/
+- RFC 2960 - SCTP (Stream Control Transmission Protocol)
+  SCTP is an IP based, message oriented reliable transport protocol with
+  congestion control, support for transparent multi-homing and multiple
+  ordered streams of messages. RFC2960 defines the core protocol.
+  More information about the protocol can be found at
+  http://www.ietf.org/rfc/rfc2960.txt
+  and about the Linux kernel implementation at
+  http://lksctp.sourceforge.net
+- ANSI/IEEE 802.2 LLC type 2 Support
+  Full implementation of LLC 1 and 2 stack, used by Appletalk, IPX and Token
+  Ring, also needed for the out of the tree, not yet functional NetBEUI
+  stack and for Linux SNA.
+
+  This is based on the stack released under the GPL by Procom Inc. for the
+  2.0.30 Linux kernel.
+
+
+Crypto
+~~~~~~
+- A generic crypto API has been merged, offering support for various
+  algorithms (HMAC,MD4,MD5,SHA-1,SHA256,SHA384,SHA512,DES,Triple DES EDE,
+  Blowfish, Twofish, Serpent, AES, CAST5, CAST6)
+
+- This functionality is used by IPSec and the crypto-loop. It's possible
+  that it will later also be available for use in userspace through a crypto
+  device, possibly compatible with the OpenBSD crypto userspace.
+
+- The in-kernel loopback device can now do crypto using the CryptoAPI.
+  May need new userspace tools.
+
+- A 2.4->2.6 cryptoloop migration guide is at http://clemens.endorphin.org/Cryptoloop_Migration_Guide.html
+
+Ports.
+~~~~~~
+- 2.6 features support for several new architectures.
+  - x86-64 (AMD Hammer)
+  - ppc64
+  - UML (User mode Linux)
+    See http://user-mode-linux.sf.net for more information.
+  - uClinux: m68k(w/o MMU), h8300 and v850. sh also added a uClinux option.
+- The 64 bit s390x port was collapsed into a single port, appearing
+  as a config option in the base s390 arch.
+- In the opposite direction, arm26 was split out from arm.
+- The x86 architecture also got 'subarch' support for 'strange' x86
+  boxes (usually big boy toys). Currently supported subarchs include
+  - ES7000
+  - PC9800 (incomplete merge)
+  - VISWS (Was in 2.4, but now maintained again)
+  - Voyager. (http://www.hansenpartnership.com/voyager/)
+
+
+TODO:
+    PCI IDs (new_id, agpgart try_unsupported)
+    libsysfs
+    kdev_t changes?
+    ISDN rewrite?
+    AFS
+    DVB
+    Hangcheck timer
+    /proc/sysrq-trigger
+    libata
+    initramfs ?
_