MMX technology was originally named for multimedia extensions, or matrix math extensions, depending
on whom you ask. Intel officially states that it is actually not an abbreviation and stands for nothing
other than the letters MMX; however, the internal origins are probably one of the preceding. MMX
technology was introduced in the later fifth-generation Pentium processors as a kind of add-on that
improves video compression/decompression, image manipulation, encryption, and I/O processing—
all of which are used in a variety of today’s software.
MMX consists of two main processor architectural improvements. The first is basic: All MMX chips
have a larger internal L1 cache than their non-MMX counterparts. This improves the performance of
any and all software running on the chip, regardless of whether it actually uses the MMX-specific
instructions. The second is an extension to the instruction set, which brings a new instruction
capability called single instruction, multiple data (SIMD).
Modern multimedia and communication applications often use repetitive loops that, while occupying
10% or less of the overall application code, can account for up to 90% of the execution time. SIMD
enables one instruction to perform the same function on multiple pieces of data, similar to a teacher
telling an entire class to “sit down,” rather than addressing each student one at a time. SIMD enables
the chip to reduce processor-intensive loops common with video, audio, graphics, and animation.
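
The SIMD idea described above can be sketched in C with compiler intrinsics. This is an illustrative sketch, not code from Intel: the function names add_i16_scalar and add_i16_simd are hypothetical, and the SIMD path assumes a compiler that provides <mmintrin.h> and defines __MMX__ (as GCC and Clang typically do on x86). On any other target the function falls back to the ordinary loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Scalar baseline: one 16-bit addition per loop iteration. */
void add_i16_scalar(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(a[i] + b[i]);
}

/* SIMD version (hypothetical helper): four 16-bit additions per
 * intrinsic call when MMX intrinsics are available. */
void add_i16_simd(const int16_t *a, const int16_t *b, int16_t *out, size_t n)
{
    size_t i = 0;
#if defined(__MMX__)
    for (; i + 4 <= n; i += 4) {
        __m64 va, vb, vsum;
        memcpy(&va, a + i, sizeof va);      /* load four packed 16-bit values */
        memcpy(&vb, b + i, sizeof vb);
        vsum = _mm_add_pi16(va, vb);        /* packed add; corresponds to PADDW */
        memcpy(out + i, &vsum, sizeof vsum);
    }
    _mm_empty();  /* corresponds to EMMS: release the state shared with the FPU */
#endif
    /* Leftover elements (or every element, without MMX) use the scalar path. */
    for (; i < n; i++)
        out[i] = (int16_t)(a[i] + b[i]);
}
```

Both functions produce the same results; the SIMD version simply needs about a quarter as many add operations for the bulk of the array. The _mm_empty() call reflects the fact that MMX shares its registers with the floating-point unit.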
Intel also added 57 new instructions specifically designed to manipulate and process video, audio, and
graphical data more efficiently. These instructions are oriented to the highly parallel and often
repetitive sequences frequently found in multimedia operations. Highly parallel refers to the fact that the
same processing is done on many data points, such as when modifying a graphic image. The main
drawbacks to MMX were that it worked only on integer values and used the floating-point unit for
processing, so time was lost when a shift to floating-point operations was necessary. These drawbacks
were corrected in the additions to MMX from Intel and AMD.
Intel licensed the MMX capabilities to competitors such as AMD and Cyrix, who were then able to
upgrade their own Intel-compatible processors with MMX technology.
In February 1999, Intel introduced the Pentium III processor and included in that processor an update
to MMX called Streaming SIMD Extensions (SSE). These were also called Katmai New Instructions (KNI)
up until their debut because they were originally included on the Katmai processor, which was the
code name for the Pentium III. The Celeron 533A and faster Celeron processors based on the Pentium
III core also support SSE instructions. The earlier Pentium II and Celeron 533 and lower (based on the
Pentium II core) do not support SSE.
The Streaming SIMD Extensions consist of 70 new instructions, including SIMD floating point,
additional SIMD integer, and cacheability control instructions. Some of the technologies that benefit from
the Streaming SIMD Extensions include advanced imaging, 3D video, streaming audio and video,
and speech-recognition applications.
The SSE instructions are particularly useful with MPEG2 decoding, which is the standard scheme
used on DVD video discs. Therefore, SSE-equipped processors should be more capable of performing
MPEG2 decoding in software at full speed without requiring an additional hardware MPEG2 decoder
card. SSE-equipped processors are also much better and faster than previous processors when it comes
to speech recognition.
One of the main benefits of SSE over plain MMX is that it supports single-precision floating-point
SIMD operations, which have posed a bottleneck in 3D graphics processing. Just as with plain
MMX, SIMD enables multiple operations to be performed per processor instruction. Specifically, SSE
supports up to four floating-point operations per cycle; that is, a single instruction can operate on
four pieces of data simultaneously. SSE floating-point instructions can be mixed with MMX
instructions with no performance penalties. SSE also supports data prefetching, which is a mechanism for
reading data into the cache before it is actually called for.
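
As a rough illustration of the two SSE features just described, the following C sketch multiplies an array of single-precision values four at a time and issues a prefetch hint for data needed a few iterations later. The function name scale_f32 and the PREFETCH_AHEAD distance are assumptions made for this example; the intrinsics come from <xmmintrin.h>, and a plain loop is used when the compiler does not define __SSE__.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Assumed prefetch distance in elements; a real value would need tuning. */
enum { PREFETCH_AHEAD = 16 };

/* Hypothetical helper: out[i] = in[i] * k for n single-precision values. */
void scale_f32(const float *in, float *out, size_t n, float k)
{
    size_t i = 0;
#if defined(__SSE__)
    const __m128 vk = _mm_set1_ps(k);       /* k copied into all four lanes */
    for (; i + 4 <= n; i += 4) {
        /* Hint that data a few iterations ahead will be needed soon. */
        if (i + PREFETCH_AHEAD < n)
            _mm_prefetch((const char *)(in + i + PREFETCH_AHEAD), _MM_HINT_T0);
        __m128 v = _mm_loadu_ps(in + i);    /* load four floats (unaligned OK) */
        _mm_storeu_ps(out + i, _mm_mul_ps(v, vk));  /* four multiplies at once */
    }
#endif
    /* Remaining elements, or all of them when SSE is unavailable. */
    for (; i < n; i++)
        out[i] = in[i] * k;
}
```

The prefetch is only a hint, so removing it does not change the results; whether it helps depends on the processor and the access pattern.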
SSE includes 70 new instructions for graphics and sound processing over what MMX provided. SSE is
similar to MMX; in fact, besides being called KNI, SSE was called MMX-2 by some before it was
released. In addition to adding more MMX-style instructions, the SSE instructions allow for floating-point calculations and now use a separate unit within the processor instead of sharing the standard
floating-point unit as MMX did.
SSE2 was introduced in November 2000, along with the Pentium 4 processor, and adds 144 additional
SIMD instructions. SSE2 also includes all the previous MMX and SSE instructions.
SSE3 was introduced in February 2004, along with the Pentium 4 Prescott processor, and adds 13 new
SIMD instructions to improve complex math, graphics, video encoding, and thread synchronization.
SSE3 also includes all the previous MMX, SSE, and SSE2 instructions.
SSE4 (also called HD Boost by Intel) was introduced in January 2008 in versions of the Intel Core 2
processors (SSE4.1) and was later updated in November 2008 in the Core i7 processors (SSE4.2). SSE4
consists of 54 total instructions, with a subset of 47 instructions comprising SSE4.1, and the full 54
instructions in SSE4.2.
Advanced Vector Extensions (AVX) was introduced in January 2011 in the second-generation Core i-series
“Sandy Bridge” processors, and is also supported by AMD’s new “Bulldozer” processor family. AVX is a
new 256-bit instruction set extension to SSE, comprising 12 new instructions. AVX helps floating-point
intensive applications such as image and A/V processing, scientific simulations, financial analytics,
and 3D modeling and analysis to perform better. AVX is supported on Windows 7 SP1, Windows
Server 2008 R2 SP1, and Linux kernel version 2.6.30 and higher.
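
Because each processor generation supports a different subset of these extensions, software usually checks at run time before using them. The sketch below is one possible way to do that, assuming GCC or Clang on an x86 target, where the __builtin_cpu_supports builtin is available; the SIMD_* flag names and the detect_simd function are hypothetical. On other compilers or architectures the function simply reports nothing.

```c
/* Hypothetical flag bits, one per extension discussed in the text. */
enum {
    SIMD_MMX   = 1 << 0,
    SIMD_SSE   = 1 << 1,
    SIMD_SSE2  = 1 << 2,
    SIMD_SSE3  = 1 << 3,
    SIMD_SSE41 = 1 << 4,
    SIMD_SSE42 = 1 << 5,
    SIMD_AVX   = 1 << 6
};

/* Returns a bitmask of the extensions the running processor reports.
 * Uses a GCC/Clang builtin; returns 0 where that builtin is unavailable. */
unsigned detect_simd(void)
{
    unsigned mask = 0;
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  /* safe to call before querying features */
    if (__builtin_cpu_supports("mmx"))    mask |= SIMD_MMX;
    if (__builtin_cpu_supports("sse"))    mask |= SIMD_SSE;
    if (__builtin_cpu_supports("sse2"))   mask |= SIMD_SSE2;
    if (__builtin_cpu_supports("sse3"))   mask |= SIMD_SSE3;
    if (__builtin_cpu_supports("sse4.1")) mask |= SIMD_SSE41;
    if (__builtin_cpu_supports("sse4.2")) mask |= SIMD_SSE42;
    if (__builtin_cpu_supports("avx"))    mask |= SIMD_AVX;
#endif
    return mask;
}
```

A caller might test (detect_simd() & SIMD_AVX) before choosing a 256-bit code path and otherwise fall back to SSE or scalar code. As the text notes, AVX also depends on operating system support, so a processor flag alone may not be sufficient.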
3DNow! technology was originally introduced as AMD’s alternative to the SSE instructions in the Intel
processors. It included three generations: 3DNow!, Enhanced 3DNow!, and Professional 3DNow!
(with full support for SSE). AMD announced in August 2010 that it was dropping support for
3DNow!-specific instructions in upcoming processors.