2010-08-20

intel, linux and hardware monitoring

Yes, it's time for another rant (plus I have some spare time at the moment). This time it's about Intel, the open-source poster-boy/girl. While I do applaud Intel for devoting some real resources to Linux support, it's somewhat sad that one important part has never received serious attention: hardware monitoring.

You should take all of the content of this post as a personal opinion by me, especially because this post contains quite a number of speculative assertions.

Growing up with computers, I've tried to favor Intel motherboards (for at least the past 10 years), or at least Asus boards with an Intel chipset. The reason is very simple: too many negative experiences with Linux and AMD-equipped motherboards. While I did quite like the processors from AMD, somehow they always managed to ally themselves with chipsets which were (for lack of a more fitting word) crap. VIA anyone? At least with Intel motherboards you could be fairly certain that Linux would run on them without too many problems.

Times change, markets evolve while quality devolves, and now the end result is that I've lost my love for Intel.

Besides the whole "Intel" GMA 500 debacle (I was a sucker), the quality of integration in the desktop motherboards has steadily decreased. Instead of using Intel's ethernet chips, you now have the same RTL/BCM crap that is used on el-cheapo motherboards. Instead of working SATA support, there was a phase where you could find pretty much any random PATA/SATA add-on chip on the motherboards. Obviously, since SATA has wiped the floor clean of PATA and Intel has got their act together, this isn't as big an issue as it was before. But my unwavering trust in Intel has.. wavered.

The remaining sore point for the last few years has been the total lack of hardware monitoring support on Intel motherboards when using Linux. Previously, the sensor/monitor chips were connected to the chipset via straight I2C or SMBus (Intel "improved" I2C), or sat at an ISA address accessed via an LPC chip (which might've contained the ADCs/pulse counters internally).

I have no idea what went through the minds of Intel's hardware engineers when they decided to throw all the existing infrastructure away and replace it with something that they developed themselves (and which had to be better, right?).

Intel does claim that I2C (and hence SMBus) had the inherent problem of being unreliable in an electrically noisy environment such as the motherboard in a PC/server. Since I2C doesn't have built-in checksums or redundancy checks/corrections, it must be bad? Well, instead of making a new version of SMBus which would force the protocol to carry a CRC, they decided to develop completely new signalling. Why? Patent/licensing lock-in of course.
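For reference, the checksum in question is nothing exotic. SMBus 1.1 and later already define an optional Packet Error Code (PEC): a CRC-8 over the message bytes using the polynomial x^8 + x^2 + x + 1. A minimal sketch of that calculation in Python follows; the function name `smbus_pec` is my own and not taken from any library.

```python
def smbus_pec(data: bytes) -> int:
    """CRC-8 as used for the SMBus Packet Error Code.

    Polynomial x^8 + x^2 + x + 1 (0x07), initial value 0, no final XOR,
    bits processed most significant first.
    """
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                # Top bit set: shift out and fold in the polynomial.
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def pec_ok(frame_with_pec: bytes) -> bool:
    """A frame with its PEC appended checks out when the CRC over it is 0."""
    return smbus_pec(frame_with_pec) == 0
```

The sender appends `smbus_pec(frame)` as the last byte, and the receiver runs `pec_ok()` over everything it saw, address bytes included. That is about as much machinery as "mandatory CRC on SMBus" would have needed.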

So, together with ADI (Analog Devices Inc, whose PC thermal monitoring business has since been sold to ON Semiconductor), Intel developed SST (Simple Serial Transport) that would "solve" all of I2C/SMBus problems. As Intel put it: "A bus was required to enable industry-wide compatibility with system management devices, such as temperature sensors and voltage monitors in computing applications". Yes. A bus. To connect industry wide compatible sensors. I2C? But how can you reap licensing fees from technology which is close to public domain? The specification for SST is probably available, but hidden behind NDAs so it won't be too helpful here. In short, it's a much higher frequency bi-directional serial bus which uses a mixed clock/data signalling similar to Manchester encoding but with some obvious twists in order to qualify as "original intellectual property".
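To show what "mixed clock/data signalling" means, here is plain textbook Manchester encoding sketched in Python. This is emphatically not SST (I don't have that spec); it only illustrates the general idea of folding the clock into the data line. It assumes the IEEE 802.3 convention, where a 1 is a low-to-high transition in the middle of the bit and a 0 is high-to-low. The function names are mine.

```python
def manchester_encode(bits):
    """Encode a sequence of 0/1 bits into half-bit line levels.

    IEEE 802.3 convention: 1 -> (low, high), 0 -> (high, low).
    Every bit has a mid-bit transition, so the receiver can recover the clock.
    """
    levels = []
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError(f"not a bit: {bit!r}")
        levels.extend((0, 1) if bit else (1, 0))
    return levels


def manchester_decode(levels):
    """Decode half-bit line levels back into bits, rejecting invalid pairs."""
    if len(levels) % 2:
        raise ValueError("odd number of half-bit levels")
    bits = []
    for i in range(0, len(levels), 2):
        pair = (levels[i], levels[i + 1])
        if pair == (0, 1):
            bits.append(1)
        elif pair == (1, 0):
            bits.append(0)
        else:
            # No mid-bit transition: not valid Manchester data.
            raise ValueError(f"missing transition at half-bit {i}")
    return bits
```

The price is that the line toggles up to twice per bit, which is one reason such a bus ends up running at a "much higher frequency" than the data rate alone would suggest.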

As if this weren't enough, at the same time Intel was ready to launch their TPM environment, which executes in various forms (now as a separate MCU part within the chipset, AFAIK). What better place to put the code that talks with the sensors than a secure sandbox within the chipset to which the end-user (and owner of the equipment) has no real access?

Intel pushed the solution as vPro, QST, AMT and various other marketing acronyms which changed with different processor and chipset combinations. Since the data is of actual value to corporate users (home users don't need to monitor their systems, it seems), there needed to be a way to access this data. Since all access to the protected sandbox needs to go via a single point of entry, HECI was born. An interface to talk with the sandbox (or the actual sandbox, the terminology isn't very clear since it's a fusion between marketing terms, obscure non-documenting documents and Google lore).

Now, don't get me wrong Intel, but making it close to impossible to get useful environmental measurement data from the PC is only doing a disservice to you. But I guess you don't care. The feeling is mutual and from this day on I promise not to recommend your motherboards or chipsets over the competition.

Intel did attempt to push a HECI interface driver at some point. However, it was pretty much rejected the first time as there was no code that could use it. Once it was rejected, Intel managed to release the QST SDK for Linux as well, but it was already too late. I guess what happened was that the person working on the Linux support part either got a more fulfilling job or was transferred to do something else which was Important.

In 2.6.30, the HECI interface driver was added to the staging tree, but it was removed again (from staging) in 2.6.32 upon request from Intel. There have been some successes using the QST SDK with older motherboards some time ago, but newer boards are probably still beyond the reach of open source.

For amusement, you might check this bug report against lm-sensors. The closing entry in the ticket shows how much goodwill Intel has managed to gather during the past years of Linux support. Sadly, the times, they are a changin'. If you really want to give it a go, a good starting point might be the ThinkPad wiki on AMT.

What's left then? ACPI? Using quality BIOS code? The day that ACPI is complete and bug-free in shipped products, I will eat my hat (I have several, and I will post a voting possibility so that a suitable hat may be selected).

I guess a solution which involves an AT90USB module with all the ADCs and counting logic, internal to the PC chassis, might work. I'm not sure yet how well driving the current to the various fans would work without too much of a power-hassle (I just haven't thought about it too much). Then hook it up to an internal USB connector and talk to the measurement logic using regular userspace access. Any takers? (or even interest?).
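The host side of such a gadget would be mostly arithmetic. A rough sketch in Python of the two conversions involved, with every number in it an assumption on my part: a 10-bit ADC (as found on e.g. the AT90USB1287), a 5 V reference, a resistor divider in front of the higher rails, and fans that give two tach pulses per revolution (typical for PC fans, but check yours). How the raw counts get across USB is left out entirely.

```python
ADC_BITS = 10  # assumption: 10-bit ADC, so raw readings are 0..1023


def adc_to_volts(raw, vref=5.0, divider_ratio=1.0):
    """Convert a raw ADC count to the voltage on the monitored rail.

    vref          -- ADC reference voltage (assumed 5.0 V here)
    divider_ratio -- (R1 + R2) / R2 of the resistor divider in front of the
                     ADC pin; 1.0 means the rail is measured directly
    """
    if not 0 <= raw < (1 << ADC_BITS):
        raise ValueError(f"raw ADC value out of range: {raw}")
    pin_volts = raw * vref / (1 << ADC_BITS)
    return pin_volts * divider_ratio


def tach_to_rpm(pulses, window_s, pulses_per_rev=2):
    """Convert tach pulses counted over window_s seconds to RPM."""
    if window_s <= 0:
        raise ValueError("measurement window must be positive")
    return pulses / pulses_per_rev / window_s * 60
```

For example, a mid-scale reading of 512 behind a 4:1 divider works out to 10 V on the rail, and 40 pulses counted in one second is a fan turning at 1200 RPM.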
