Linux on Power – To BE or Not To BE… why should I care?

September 8th, 2015
Ron Gordon
Director of Power Systems
ron.gordon@mainline.com

 

Linux on POWER is rapidly gaining acceptance and adoption thanks to its outstanding performance, its capacity to accommodate large workloads, its virtualization options, and a proven track record in supporting mission-critical applications. Another driving force is the increased availability of applications that support customers’ evolving demands. What IBM Power Linux systems provide that is unique in the industry today is the ability to run Big Endian and Little Endian applications, either separately or side by side on the same system. With Little Endian support recently added to IBM Power Linux systems, distribution availability has increased (Ubuntu and SLES 12 are Little Endian only and now support Power Linux systems), and application creation and porting have been simplified. One technical consequence, however, is that the Endianness of an application must match that of the Linux distribution it runs on. Sometimes you may have a choice of application Endianness, or you may wonder whether your home-grown application should be compiled as Big Endian or Little Endian. Let’s investigate this.

Big Endian and Little Endian are byte orderings that define how a multi-byte value is laid out in memory: the most significant byte is stored first, at the lowest address (Big Endian), or the least significant byte is stored first (Little Endian). For many years, Big Endian was the format used by IBM mainframes and many RISC and UNIX systems, including POWER. x86 took the opposite approach and stores data in Little Endian order. x86 is not the only Little Endian architecture, but since it has the predominant market share, Little Endian is the most pervasive format at this time. There is a YouTube presentation on why x86 uses Little Endian. For the purposes of this discussion, Endianness pertains to how data is laid out, not to program logic.

Compilers produce code for a specific Endianness, with LE (Little Endian) being the default for x86 compiles and BE (Big Endian) being the default for traditionally Big Endian targets such as POWER. POWER8 is an exception in that compilers such as XLC and GCC accept a “compile to” target of either PPC or PPCLE, which sets the Endianness to BE or LE respectively. Now, when you boot a Linux distribution, the OS has to be LE to run LE compiled applications or BE to run BE compiled applications. On POWER8, the processor handles the difference in hardware: when an LE application loads data from memory or stores it back, the byte order is interpreted accordingly, so LE data is treated correctly and transparently. Therefore, POWER8 is bi-Endian. POWER7 can only run in BE mode.
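To make the difference concrete, here is a minimal Python sketch that lays out the same 32-bit integer in both orders. The value 0x0A0B0C0D is an arbitrary illustration, not something taken from a POWER system, and the script only shows byte layout; it says nothing about how any particular processor implements it.

```python
import struct
import sys

value = 0x0A0B0C0D  # arbitrary 32-bit value chosen for illustration

# ">" forces Big Endian: most significant byte first.
big = struct.pack(">I", value)
# "<" forces Little Endian: least significant byte first.
little = struct.pack("<I", value)

print("BE layout:", big.hex())     # 0a0b0c0d
print("LE layout:", little.hex())  # 0d0c0b0a

# Byte order of the machine running this script ("little" or "big").
print("host byte order:", sys.byteorder)
```

The same four bytes appear in both layouts, only in reverse order, which is why raw binary data written by a BE program cannot be read unchanged by an LE program.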

So, why should you care? For one, the Endianness of the application must match the Endianness of the Linux distribution. Ubuntu is LE only; SLES 11 is BE only; SLES 12 is LE only; RedHat 6.x is BE only; RedHat 7.1 has two distributions, one LE and one BE. Since running applications is the objective, check the Endianness of the application first, and then choose a Linux distribution that matches it. One thing you don’t have to worry about today is the hypervisor: both PowerKVM and PowerVM can support BE or LE virtual machines on POWER8. (You must be at the latest PowerVM level [V 2.3.3.50] to have this capability, as well as the appropriate microcode level.) Another thing you don’t have to worry about is the POWER8 system model. You can run your Linux VM on either the Scale Out or Enterprise systems with PowerVM, or run your Linux VM on Scale Out Linux-only servers with PowerKVM, PowerVM, or RedHat Smart Virtualization.
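One way to check the Endianness of a compiled Linux application is to look at its ELF header, where the sixth byte (EI_DATA) is 1 for Little Endian and 2 for Big Endian. The sketch below is a minimal illustration of that check, not a tool shipped with any distribution; the function names are hypothetical.

```python
ELF_MAGIC = b"\x7fELF"
EI_DATA = 5  # offset of the data-encoding byte inside the ELF identification


def elf_endianness(header: bytes) -> str:
    """Return 'LE', 'BE', or 'unknown' from the first bytes of a file."""
    if len(header) <= EI_DATA or header[:4] != ELF_MAGIC:
        return "unknown"  # not an ELF file (for example, a script)
    # 1 = ELFDATA2LSB (Little Endian), 2 = ELFDATA2MSB (Big Endian)
    return {1: "LE", 2: "BE"}.get(header[EI_DATA], "unknown")


def file_endianness(path: str) -> str:
    """Hypothetical helper: classify the file at the given path."""
    with open(path, "rb") as f:
        return elf_endianness(f.read(16))
```

Calling `file_endianness` on an application binary and comparing the result with the Endianness of the target distribution gives a quick sanity check before deployment. Scripts and Java archives are not ELF files, so they come back as "unknown".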

Should you invest in LE or BE? This is an interesting question. Since the real driver of “should I run BE or LE” is the Endianness of the application, which you usually cannot control, it is good that you can run either one, and both concurrently, in a virtualized environment. By the way, Java and most interpreted languages (Perl, PHP, Python, etc.) are Endian neutral, so those applications are not a concern. Compiled code or ISV code should state its Endianness, or the Linux distributions it supports, so you know whether to run it on a BE or LE distribution. Today, most existing Linux on Power applications (e.g., WebSphere, DB2, Storix) are compiled as BE for historical reasons. The availability of LE on POWER is a great thing for these ISV application providers, since they can now code for Little Endian if their application depends on byte order. And, by compiling the same source for x86 and PPCLE, they can produce a single application whose data is portable between x86 and POWER systems. There are many application areas where byte order is significant to the application and its performance, such as device drivers, CAPI applications, and HPC applications. So, the ability to run BE or LE is a great capability for ISVs that want to maintain a single portable source tree.

What will happen in the future? I suspect that LE will become the predominant compilation target for Power Systems, since it is easier for solution providers, ISVs, and open source projects to have only one source tree to support. I also believe that this will bring many additional applications to Power Systems running Linux that some providers were reluctant to offer previously because of potential Endianness issues. As an enterprise, if you need to code applications and integrate them with data from x86, which is in Little Endian format, then compiling for PPCLE should be considered the strategic direction.
Note: Most modern databases are Endian neutral; when the data is accessed by a BE or LE application, the database program presents it in the proper format.
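For home-grown applications that write raw binary data rather than going through a database, portability between x86 and POWER depends on the byte order of the stored data. A minimal sketch of one common approach, pinning the on-disk format to a fixed byte order, is shown below. The record layout (a 32-bit id followed by a 64-bit float) and the function names are hypothetical and chosen only for illustration.

```python
import struct

# Hypothetical record: unsigned 32-bit id, then a 64-bit float.
# "<" fixes the stored format as Little Endian on every host,
# so BE and LE systems read and write identical bytes.
RECORD = struct.Struct("<Id")


def encode(record_id: int, reading: float) -> bytes:
    """Serialize one record in the fixed Little Endian layout."""
    return RECORD.pack(record_id, reading)


def decode(blob: bytes) -> tuple:
    """Parse one record; the result is the same on BE and LE hosts."""
    return RECORD.unpack(blob)


blob = encode(7, 2.5)
print(len(blob), decode(blob))
```

Because the byte order is stated in the format string instead of being inherited from the host, the same file can be exchanged between BE and LE systems without conversion.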

Summary: Endianness is very important, since the Endianness of the application dictates the Endianness of the Linux distribution. Java and scripting languages, as well as most modern databases, are Endian neutral. But since most future applications for Linux on Power will be LE (my prediction), LE should be considered the strategic environment.

Mainline