Archive for the ‘Software and Software Development’ Category.

Windows 7

Last Friday, I discovered that I needed a 64-bit version of Windows for development purposes. I had a 64-bit version of Windows XP on a spare machine, but I haven’t set it up since the recent move, and until we get more stuff organized, I frankly don’t have the room. But it occurred to me that I’m running a 64-bit machine now, so at least in theory, I should be able to install a 64-bit version of Windows in a virtual machine under VMware Fusion. So I figure, what the heck, let’s give it a try. And while I’m at it, I might as well finally take a look at Windows 7 too.

MSDN supplied a 64-bit copy of Windows 7 “Ultimate”, and a key for it. When I created a virtual machine for it, VMware Fusion obligingly recognized the downloaded ISO file and installed it for me with no further instruction on my part. After the obligatory round of updates and reboots, I settled down to figure out how to use it.

In any new Windows OS, I have to figure out how to change a few things immediately. The “don’t show hidden files” option has to come off, and especially the “hide extensions for known file types” option. I’m a big boy, I can stand seeing file extensions and files that Microsoft thinks are too dangerous or scary for delicate eyes. I do not want it to automatically reboot if it crashes; I want to see that it has crashed, and any diagnostics that it produces, before I (manually!) reboot it. And turn off those damned animations: I don’t care about them, and you’re running in a virtual machine, so don’t waste the CPU cycles.

Since Microsoft always moves the settings around between versions, I had to poke around awhile to find them, and I got a good look at the OS as I did so. And I have to say, I’m impressed. The user interface is a lot better than it was before: easier to use, easier to navigate around, and it mostly stays out of your way until you need it. The system is pretty responsive, something that you couldn’t always say about earlier incarnations. It’s using roughly half of the one gigabyte of memory that VMware Fusion allocated to it, leaving half a gigabyte of memory for programs to use without swapping. Even the security stuff didn’t cause me much trouble; I just had to tell it that it was okay to run programs from a network drive without panicking and warning me about them each time.

But it wasn’t until I installed Visual Studio that I really saw the biggest difference. All of my programs compiled 33% to 50% faster! I’d anticipated that the compiler would be a little faster under a 64-bit operating system, despite the fact that it’s only a 32-bit program itself, but that’s a lot more than I’d expected!

All told, it was one of the most painless Windows installs I’ve ever done, and it didn’t take me too long to decide that this was going to be my new development VM. :-) I did run into one minor but baffling glitch, which I still haven’t determined the cause of (could be Windows 7, could be Visual Studio 2005, might even be something else entirely), but I found a work-around for it, so everything is working fine now.

I’m still not going to use Windows for anything Internet-related, only for games and software development. Linux is a lot safer on the ‘net at present. But for the foreseeable future, when I use Windows, I’ll definitely be using Windows 7.

Version Control: The Big Picture

During my most recent software development project, which I’ve already blogged about extensively, I finally learned to use a version control system (VCS from here on) right.

To fully understand this, you have to know where I was coming from. I learned to program in the early-to-mid eighties, and never even heard of a VCS until about 1998, when I picked up a copy of Visual C++ 6.0. I couldn’t see much use for it — I was archiving a zip-file copy of the source code for my software at every version release, and it seemed to me that VCS did the exact same thing.

At that point, I viewed the “working copy” of the source code as sacred. I’d quickly learned that whenever I made a change to it, I had to be able to reverse it if it didn’t work out, so for small changes, old code was just commented out and left there — that, and the zipped-up backups, were my VCS. It made for some messy code, because I rarely went back to remove the old commented-out stuff, but it worked.

When I sold my earlier software company in 2003 and went to work for the new company for a while, I was shocked at just how casually developers at the new company treated source code. They made changes to it willy-nilly, without leaving the old code intact! What if they screwed something up? The five-minute introduction to CVS that I got explained how to check in and check out source code, but nothing about the big picture of how or why to use it. To me, it was still just a glorified backup program.

Fast forward to a few weeks ago, when I started making some extreme source code changes to a large twelve-year-old project. That’s when I started really learning how to use version control.

The first lesson: the working copy is temporary. The sacred version of the source code is the one in the version-controlled repository, and if you’re using it properly, you can always fall back on it if your changes have introduced problems. You can also work from it on multiple machines if necessary, with no problems, which will be the subject of a future compile-speed-related blog entry.

There’s a corollary to this: check in early, check in often. Make small and focused changes, get them working, and check them in as soon as they’re done. That way, if a change introduces a bug, you can easily locate the exact source-code alterations that caused it. And if all else fails, you can easily throw it away and revert to the last checked-in copy without losing valuable work. If I’d known this a few weeks ago, and practiced it, I could have saved several days of debugging huge check-ins when they later turned out to have problems.

And finally, if you have to make large, sweeping changes that must all be completed before you can see whether they’ll work, make a separate branch until it’s finished. Then you can keep checking in the smaller changes as they’re completed, even though they haven’t been tested, while every check-in on the main branch remains a known-good working copy of the program. With a modern version-control system like Git, you can easily merge the branch into the main code once it’s done.

Now that I see the big picture, VCSs make a lot of sense. I’ll be using mine a lot more in the future, and quite differently.

Speedier Visual C++ Compiles

As I mentioned recently, I’ve been having some issues with the speed of Visual C++ compiles. Some adjustments to my VMware Fusion settings reduced the time it takes for a complete rebuild of my project, from roughly an hour and five minutes to about 43 minutes, but I knew there was still room for improvement.

Several of the sub-projects had monolithic header files that each source code file was including; I broke these up and set each source code file to only include the ones that it was actually using. That helped some, because fewer source code files had to be compiled most of the time, but it wasn’t enough.

Several of the sub-projects shared many source code files, and the same settings. The shared files were being compiled separately for each subproject, which was inefficient. I gathered all the common source code files into several static libraries, which eliminated the redundant compiles. Again, it helped, but not enough.

Finally, we get to the subject of precompiled headers.

I never had much use for precompiled headers. They never seemed to do much to reduce my compile times… in fact, they seemed to increase my compile times, which is why I had them turned off in all of these projects. But I found a page that explained the proper use of them (hint: MSVC’s default setup doesn’t use them right).

In a nutshell:

  • Create a single header file (I’ll call it precompiled.h here) which #includes only those header files that never (or almost never) change: windows.h, stdio.h, the STL, the Boost libraries, etcetera. Add it to your project.
  • Create a source code file (I’ll call it precompiled.cpp, to go along with precompiled.h) that contains nothing but one line, #include "precompiled.h". Add it to your project too.
  • Add #include "precompiled.h" as the very first non-comment line in all source code files in the project. This is vital, because MSVC will ignore anything before it.
  • Set the entire project to use (not create!) precompiled headers, and set the precompiled.cpp file to create them.
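
To make the layout concrete, here is a rough sketch of the three kinds of files from the steps above, squeezed into one listing. The file name report.cpp and the function count_words are made up for illustration, and I've used only standard headers so the listing compiles anywhere; a real precompiled.h would also pull in windows.h, Boost and friends. The #include "precompiled.h" lines are shown commented out only because all three files share one listing here; in the real files they are live. On the command line, "create" and "use" correspond to MSVC's /Yc and /Yu options.

```cpp
// ---- precompiled.h ----
// Only headers that never (or almost never) change belong here.
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// ---- precompiled.cpp (the one file set to "create": cl /Yc"precompiled.h") ----
// #include "precompiled.h"

// ---- report.cpp (hypothetical; set to "use": cl /Yu"precompiled.h") ----
// #include "precompiled.h"   // must be the first non-comment line of the file
// #include "report.h"        // headers that do change come afterwards

// Ordinary project code follows; it can use everything precompiled.h pulled in.
std::map<std::string, std::size_t> count_words(const std::vector<std::string>& words) {
    std::map<std::string, std::size_t> counts;
    for (const std::string& word : words) {
        ++counts[word];  // operator[] value-initializes a missing entry to zero
    }
    return counts;
}
```

Frequently edited headers stay out of precompiled.h because any change to it forces the precompiled.cpp step, and then every other file in the project, to be rebuilt.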

The result: the compiles (once precompiled.cpp itself has been built, which is seldom necessary) are now ludicrously fast, at least ten times as fast as they were without precompiled headers! I assume most of that speed increase comes from the fact that I use a lot of stuff from the Boost library, but precompiled headers should help a lot even if you’re only using the raw Windows API functions (windows.h is huge, and pulls in a bunch of other header files too).

So compiles are now blazingly fast. Not as fast as I’d get on a good desktop machine (or even this one running on bare metal instead of virtualized), but much, much more tolerable.

My productivity on that project has soared. With those three changes combined, a compile usually takes less than a minute now, so I can try over fifty times more compiles in a day. I find myself a lot more willing to experiment with the project too, now that I know I’m not going to have to wait so long for the results. All in all, it was well worth the effort.

Crazy?

This is from yesterday’s Dilbert Blog entry:

A Muslim, a Christian, and a crazy guy walk into a room. The one thing you can know for sure is that at least two out of three of them organize their lives around things that aren’t real. And that’s the best case scenario. Atheists would say all three have some explaining to do. And atheists are the minority, which is the very definition of abnormal.

Hm… is computer software “real”? ;-)

VMware Fusion and Windows Development

I’ve spent the last few days integrating my new math library into the Windows project I started coding it for. Yesterday morning I was ready to try running the integrated copy, but I had some kind of problem starting it up. I couldn’t track it down very easily though, due to how long it took to do a full compile of that project.

As you might remember, I’m now using a MacBook Pro, and doing Windows development under VMware Fusion. I expected that things would be a little slower under VMware Fusion than on a bare-metal machine, and for the most part I didn’t really notice the difference… until I started working with this very large project again. I remembered it taking a long time to compile this one from scratch, but I remembered that “long time” as being closer to ten minutes. It was taking well over an hour now, which was simply ridiculous.

I’m also still running my Linux virtual machine at the same time, for web access and such. I had some trouble getting the two virtual machines to play well together… whenever I started the Windows VM, the Linux one suddenly started losing memory, for reasons I couldn’t understand. I tweaked the memory usage on both of them… it helped, but they still seemed to be fighting over memory at times, even though the system itself had plenty of memory for both of them and itself too, without swapping.

I finally found out the reason for all of that when I went to locate a fix for the slow Windows compiles yesterday. It turns out that VMware Fusion, unbeknownst to me, was sharing memory between the two, allowing the Windows machine to steal the Linux machine’s resources. (Just what I’d expect a Microsoft product to do. ;-) ). On VMware’s site, I discovered a page with a note at the bottom:

Adding the following settings to a virtual machine can reduce the I/O load on the hard disk, however these adjustments require additional memory on the host. Only add these settings if there is sufficient free memory on the host to accommodate all the memory allocated to the virtual machine, otherwise you may cause a memory starvation condition that can reduce performance of all the running virtual machines or possibly affect the host operating system.

I had plenty of memory, so bring it on! After shutting down the Windows VM, I added the four lines that the page suggested to the virtual machine’s vmx file:

    MemTrimRate = "0"
    mainMem.useNamedFile=false
    sched.mem.pshare.enable = "FALSE"
    prefvmx.useRecommendedLockedMemSize = "TRUE"

That helped, a lot. Not only with the speed of the compiles, either — the Windows system was no longer stealing the Linux machine’s memory. :-) But the compiles were still pretty slow, and it took me a while to discover why.

During a compile, I popped up the Windows Task Manager program, to see whether the system was swapping a lot of memory or something. And lo, what do I see, but that there’s another process whose memory and CPU usage were going up while the compiler was trying to run. It turned out to be Microsoft Security Essentials, the free anti-malware program that I’d installed a little while back because the VPN software that one of the companies I work with uses insisted that I have some kind of virus protection. It was supposed to have no noticeable impact on the machine. Well, it had a very noticeable impact in this case.

After turning off its “real-time protection” option (and ignoring the dire warnings about how my computer was now At Risk!), full compiles were much faster. Still noticeably slower than on a similarly-equipped bare-metal system, but a lot more tolerable. I can’t say exactly how fast yet because I’m still improving it as I work on the code; I’ll run a thorough test once I’m done.

So equipped, I finally tracked down that startup problem — it was an instance of the static initialization order fiasco, which I should have thought of a lot sooner since I’d run into it just last week too. Oh well, live and learn. For now it’s back to the real task, developing the software.
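
For anyone who hasn't met that fiasco: globals in different source files are initialized in an unspecified order, so one global's initializer can end up using another that hasn't been constructed yet. This is not the code from my project, just a minimal sketch of the usual cure (construct on first use), with made-up names like registry and register_name, written in modern C++ rather than anything specific to MSVC 2005.

```cpp
#include <map>
#include <string>

// Construct-on-first-use: the map lives inside a function, so it is built the
// first time anyone calls registry(), no matter which file's globals are
// initialized first.
std::map<std::string, int>& registry() {
    static std::map<std::string, int> instance;
    return instance;
}

// Hypothetical helper that other files might call from their own global
// initializers. Returns true if the name was newly added.
bool register_name(const std::string& name, int id) {
    return registry().emplace(name, id).second;
}

// A global like this would be risky if the registry were a plain global
// defined in another file; going through registry() makes it safe.
static const bool example_registered = register_name("example", 1);
```

The trade-off is a function call on every access, which is a small price for a program that starts up reliably.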

Taming MSVC’s Intellisense

Microsoft Visual C++ 2005 (i.e. v8) is a decent compiler, with a decent IDE. But there’s one “feature” that I dislike passionately: Intellisense.

The idea behind it is a good one: that the IDE scans your source code every so often and figures out how certain things will be compiled so that it can tell you about parts of it when you ask — the type of a variable, the parameters to a function call, the members of a class, etcetera. And it works very well… on smaller projects, anyway.

On larger ones, like Project Badger, those every-so-often scans become productivity-sucking nightmares. During a scan, the IDE sucks up all of the CPU power of the machine, and it can take several minutes to complete. On my machine, the user interface becomes practically unusable during that time. And it scans fairly often… whenever you start up the IDE, whenever you switch from one project to another, or from one configuration to another (like debug to release), and sometimes with no apparent provocation at all.

I’ve tried to find a way to disable it before, using the oft-recommended method of renaming a DLL. It doesn’t work. It might eliminate some of the scans, but the thing still scans way too often. Yesterday, after I’d had to switch my project from release to debug (and suffer through a scan) once too often, I decided to check into it again.

Lo and behold, I found a solution!

In a nutshell: with the IDE closed, replace the *.NCB file in your project directory with a zero-byte file, and make it read-only. (You may also have to copy that read-only file to your temporary directory; I did so, but I don’t know whether it’s necessary.) The IDE will complain about it when it starts up, and offer to fix the “problem;” don’t let it. Without write-access to that file, the IDE won’t even try to scan, ever!

Of course, you won’t have access to Intellisense features on that project. But I usually find them superfluous anyway, and definitely not worth the time-cost in terms of productivity — I can look up the things that I might use it for myself and save all that unproductive scanning time.

Life is good again. :-)

Amusing Spam

Spam messages are rarely amusing enough for me to post anymore, but this one takes the cake:

Subject: Hegihten the qulaity of your ereictons with Soft (ialis.

Biggest_bIowout_sale of \/aIium in our onIine pharmacy

Sorry guys, but SpamBayes wasn’t fooled. You’ll have to do better than misspelled words and ASCII art to get past it.

Busy, Busy, Busy…

There was no entry yesterday because I was deep in a programming project. Very deep… I was up and at it at 8am, and other than a few very short breaks, kept going until 3am this morning. And that’s on top of about six hours on it the day before.

Around 1995, I started writing what’s known as a “multiple-precision integer arithmetic” package. In plain English, that’s a library of functions that can handle very, very large numbers — on the order of thousands of decimal digits, far more than the number types built into general-purpose computers can handle. There are at least half a dozen others available, but I wanted to write my own (not only to sidestep licensing issues, but also because it’s fun :-) ). The first versions were crude, and the interface is still fairly ugly, but I’ve been improving it sporadically ever since.

One of the things I’ve used it for is to build a public-key encryption system for one of my programs. It worked, though not quickly. But now I needed the ability to generate large prime numbers as quickly as possible, and the current version just wasn’t cutting it.

There were a few general speed improvements I could use that hadn’t been available when I wrote the earlier versions. Some of the algorithms I used in it, such as the one for division, were quick hacks that I threw together and just never replaced. Others I just hadn’t had the time to implement yet. And there were some speed-improving ones that I’d heard about but had never mastered.

One of those last is the Montgomery reduction, an algorithm that greatly speeds up certain math operations by eliminating the division steps. It’s an extremely elegant piece of work, but I’d never been able to figure it out because most of the descriptions on the ‘net seem to lack a few important details. For example, every one of them describes the actual “reduction” function in great detail, so I wrote it, thinking it would be an important part of the algorithm, only to find that that function is never needed. And testing the functions required doing a preparatory conversion step before it and a “deconversion” step afterwards to get the real result, but the details of those steps are almost impossible to find; I figured them out by trial and error once I understood what the algorithm was doing. (For the curious: to convert to Montgomery form, multiply the number by your R value, then take the modulus; to convert back, multiply the number by the modular inverse of R and take the modulus.) I learned a lot, and it was fun, but it was also a vigorous workout for my patience.
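
Here is a toy sketch of those two conversion steps, plus one multiplication done in Montgomery form. It is not my library's code: it uses ordinary 64-bit integers instead of multi-precision ones, and the names (Montgomery, to_mont, from_mont, mul) and the fixed R of 2^16 are assumptions chosen to keep it short. The modulus must be odd and smaller than R.

```cpp
#include <cassert>
#include <cstdint>

const std::uint64_t R_BITS = 16;
const std::uint64_t R = std::uint64_t(1) << R_BITS;  // R = 2^16

// Modular inverse of a modulo m by the extended Euclidean algorithm.
// Assumes gcd(a, m) == 1.
std::uint64_t mod_inverse(std::uint64_t a, std::uint64_t m) {
    std::int64_t old_r = static_cast<std::int64_t>(a), r = static_cast<std::int64_t>(m);
    std::int64_t old_s = 1, s = 0;
    while (r != 0) {
        std::int64_t q = old_r / r;
        std::int64_t t = old_r - q * r; old_r = r; r = t;
        t = old_s - q * s; old_s = s; s = t;
    }
    std::int64_t mm = static_cast<std::int64_t>(m);
    return static_cast<std::uint64_t>(((old_s % mm) + mm) % mm);
}

struct Montgomery {
    std::uint64_t n;            // the modulus
    std::uint64_t r_inv = 0;    // R^-1 mod n, used to convert back
    std::uint64_t n_prime = 0;  // -n^-1 mod R, used inside mul()

    explicit Montgomery(std::uint64_t modulus) : n(modulus) {
        assert(n > 1 && n < R && n % 2 == 1);
        r_inv = mod_inverse(R % n, n);
        n_prime = (R - mod_inverse(n, R)) % R;
    }

    // Convert to Montgomery form: multiply by R, then take the modulus.
    std::uint64_t to_mont(std::uint64_t x) const { return ((x % n) * R) % n; }

    // Convert back: multiply by the modular inverse of R, then take the modulus.
    std::uint64_t from_mont(std::uint64_t x) const { return (x * r_inv) % n; }

    // Multiply two values already in Montgomery form. The only "divisions"
    // are a mask and a shift by R_BITS; the result is a*b*R^-1 mod n,
    // which is again in Montgomery form.
    std::uint64_t mul(std::uint64_t a, std::uint64_t b) const {
        std::uint64_t t = a * b;
        std::uint64_t m = ((t & (R - 1)) * n_prime) & (R - 1);
        std::uint64_t u = (t + m * n) >> R_BITS;
        return u >= n ? u - n : u;
    }
};
```

The conversions themselves still use an ordinary modulus, which is fine because they happen once at each end; the savings come from chaining many mul() calls in between, as a modular exponentiation does.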

When I wrote the original version, the largest single-precision integer (the kind built into the computer, and the building block of every multi-precision integer package) was only 32 bits wide. Modern computers can handle much larger integers, up to 128 bits wide in some cases. Using larger integers means needing fewer steps for the same result, improving the speed of all multi-precision operations, so I wanted to take full advantage of them. I’d written the original code so that it could, in theory, handle any such type changes by simply altering a single line. But when I did that, just about everything broke. It took several hours, and more patience, to carefully step through the operations, figure out which ones weren’t working and why, and fix them all.
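
To show what I mean by "altering a single line," here is a small sketch (again, not the library's actual code) of multi-precision addition where the building-block type is one typedef. The names limb_t and add are assumptions for illustration. Notice that the carry is detected by comparison rather than by computing in a wider type, which is the sort of detail that decides whether changing that one line actually works.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// The "single line": swap in std::uint16_t or std::uint64_t to change the
// width of every building block.
using limb_t = std::uint32_t;

// Adds two non-negative multi-precision integers stored as little-endian
// vectors of limbs (least significant limb first).
std::vector<limb_t> add(const std::vector<limb_t>& a, const std::vector<limb_t>& b) {
    const std::vector<limb_t>& longer = a.size() >= b.size() ? a : b;
    const std::vector<limb_t>& shorter = a.size() >= b.size() ? b : a;

    std::vector<limb_t> sum;
    sum.reserve(longer.size() + 1);
    limb_t carry = 0;
    for (std::size_t i = 0; i < longer.size(); ++i) {
        limb_t x = longer[i];
        limb_t y = i < shorter.size() ? shorter[i] : 0;
        // The casts truncate back to limb width even when a narrow limb_t
        // gets promoted to int during the addition.
        limb_t s = static_cast<limb_t>(x + y);
        limb_t c1 = s < x ? 1 : 0;            // wrapped around: carry out
        limb_t s2 = static_cast<limb_t>(s + carry);
        limb_t c2 = s2 < s ? 1 : 0;           // adding the old carry wrapped
        sum.push_back(s2);
        carry = static_cast<limb_t>(c1 | c2); // at most one of the two is set
    }
    if (carry != 0) {
        sum.push_back(carry);
    }
    return sum;
}
```

Code that instead assumes a double-width type exists for intermediate results is exactly what breaks when the limb becomes the widest type the machine offers.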

The last step was increasing the number of trial divisions. As a preliminary step to identifying prime numbers, I’d been trial-dividing candidate numbers by all the prime numbers below 100, which I’d figured out and hard-coded into the program. But according to Bruce Schneier’s Applied Cryptography, it was most efficient to trial-divide by all the prime numbers below 2,000, so I implemented the sieve of Eratosthenes to identify them, caching them so that the sieve only needed to be used once in any particular run.
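
In outline, that step looks something like the sketch below. It uses a plain 64-bit integer as a stand-in for a multi-precision candidate, and the function names are made up for this example. A candidate that survives trial division is only "not obviously composite" and still has to go through a real primality test.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sieve of Eratosthenes: returns every prime below `limit`.
std::vector<std::uint32_t> sieve_primes_below(std::uint32_t limit) {
    std::vector<bool> composite(limit, false);
    std::vector<std::uint32_t> primes;
    for (std::uint32_t n = 2; n < limit; ++n) {
        if (composite[n]) continue;
        primes.push_back(n);
        // Cross off multiples, starting at n*n (smaller ones are already done).
        for (std::uint64_t m = std::uint64_t(n) * n; m < limit; m += n) {
            composite[static_cast<std::size_t>(m)] = true;
        }
    }
    return primes;
}

// The cache: the sieve runs once, the first time this is called.
const std::vector<std::uint32_t>& small_primes() {
    static const std::vector<std::uint32_t> cache = sieve_primes_below(2000);
    return cache;
}

// Returns false if the candidate has a prime factor below 2,000 (or is
// less than 2); returns true if it survives and needs further testing.
bool passes_trial_division(std::uint64_t candidate) {
    if (candidate < 2) return false;
    for (std::uint32_t p : small_primes()) {
        if (candidate == p) return true;       // it is one of the small primes
        if (candidate % p == 0) return false;  // found a small factor
    }
    return true;
}
```

The point of the cheap filter is that most random candidates have a small factor, so they get thrown out before the expensive test ever runs.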

The result of all that work: locating a particular prime number took 275 seconds (four minutes and thirty-five seconds) with the old code. The new code required only ten seconds! That’s nearly thirty times faster, and well within the time that I needed.

There’s still plenty of room to improve the library, like putting in the proper division algorithm, but it’ll do for now. :-)

Microsoft Finally Got The Memo

I’ve dealt a lot with software piracy issues, primarily with Project Badger (detecting and preventing piracy is one of its primary purposes). And I didn’t have to learn the hard way that you have to be very careful before calling any user a pirate, or allowing your software to do so — paying customers don’t like being accused of theft. If there’s any chance at all that you could be wrong about it, you have to give the customer the benefit of the doubt.

For some reason, Microsoft did have to learn that the hard way. Their first antipiracy attempt, three or four years ago, was secretly installed onto systems disguised as an “important security update.” And it “caught” far too many of their legitimate, paying customers, baldly and unapologetically calling them thieves. It was clumsy and heavy-handed, too — it essentially made the system unusable until the “caught” person called Microsoft to correct the problem. Even if you didn’t experience it yourself, it’s easy to see how that could royally piss people off.

But by the sound of it, they’ve fixed all of those problems. They’re going out of their way to be open and honest about the process now; the false-detection problems seem to be fixed; and when it does think that it has caught someone, it allows the system to continue working as normal, simply informing the user of the problem.

I still don’t particularly like Windows, but it’s easier to deal with it now.

“Microsoft shops to fly Win 7 minus SP safety ’shoot”

Or, to translate from the very British terminology: companies using Windows are planning to upgrade to Windows 7 before the first service pack. Of course, that’s hardly news to us in the hard-core tech-geek community, because we see Windows 7 as nothing more than a large (and very much needed) service pack for Windows Vista anyway.