Awkward conversations after a computer repair fails

Last month I had a three-day battle with a broken RAID setup. I eventually brought it back to life, returned it to the client and, after a few days, sent an invoice. Then, a couple of days ago, it broke again. They called me in.

It’s a tough one. It could be the power supply throwing the occasional wobbly. It could be Random Motherboard Kak with the RAID chip. It could be user error. It could be something totally unrelated. Most of which are difficult to diagnose, and would involve trial and error see-if-it-lasts-this-time. They’ve decided to get a new computer instead. They also said they’d pay for my previous work, but made it clear they aren’t happy about doing so, as it “wasn’t repaired”. I think it unlikely they’d use me again.

Ugh. I hate these situations. I can see their point. But I did, in fact, get it working, and there was nothing to suggest the problem would recur. I didn’t charge anything extortionate, either. I’ll talk about it with a more knowledgeable friend to see whether there’s anything technical I should have done differently, but that’s kinda irrelevant – there are certainly situations where this could happen through no fault of my own.

I suppose from my perspective they’re paying me to attempt a repair, but to them they’re paying for a repair. I tend to assume people are aware of the former, and although I try to explain what I’m doing, maybe I need to be much more explicit about it. 95% of the time they amount to the same thing, and of the remainder it’s usually something I can tell them about pretty quickly, and charge a nominal fee. But in this kind of situation, the difference becomes important. Maybe I need to get a properly-written contract, so I’m covered. I’m probably leaving myself wide open without one, to be honest.

In the end I caved and offered to fit a couple of components into a separate machine for no extra charge, which mollified them somewhat. But I still feel like I’ve messed up, one way or another.

Breaking 1TB

I remember getting my first hard drive that was bigger than 1GB, and thinking this was amazing. Today it takes half an hour to take 1GB of photos, so I obviously need much more space. Before this morning, my setup had one “500GB” and two “250GB” drives, but this actually added up to 927GB. This is because the manufacturers’ definition of a gigabyte differs from a computer’s definition of a gigabyte. Today I added a second “500GB” backup drive (I saw far too much data loss this week, and it scared me) and passed 1TB¹ for the first time. Meaningless, but a little milestone nonetheless.
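For the curious, the gigabyte discrepancy can be sketched in a few lines of Python. This is only the pure unit conversion, assuming the box means 10^9 bytes and the computer means 2^30; it lands slightly above the 927GB I actually saw, and my guess (an assumption, not something I measured) is that formatting overhead or the drives' exact capacities account for the rest.

```python
# Decimal gigabytes (as printed on the box) versus binary gigabytes
# (as the operating system reports them, still labelled "GB").
GB = 10**9    # manufacturer's gigabyte, in bytes
GIB = 2**30   # binary gigabyte (gibibyte), in bytes
TIB = 2**40   # binary terabyte (tebibyte), in bytes


def advertised_to_binary_gb(advertised_gb):
    """Convert a capacity in decimal GB to binary GiB."""
    return advertised_gb * GB / GIB


before = [500, 250, 250]   # advertised drive sizes quoted in the post
after = before + [500]     # plus the new backup drive

print(round(advertised_to_binary_gb(sum(before)), 1))  # 931.3
print(round(sum(after) * GB / TIB, 2))                 # 1.36
```

So the new total clears a terabyte by either definition.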

I’m trying to work out if I’ll ever hit the next milestone: 1024 terabytes = a petabyte. Let’s say I become a professional sports photographer, or something, and take 8GB of photos per day. Even with that, it’d still take 342 years to hit a petabyte. I’d need 55GB of photos per day to hit a petabyte within 50 years. For my camera that would be 6875 photos/day, while the most expensive Canon SLR, at 22MB/photo, would need 2500 shots. Nah, I can’t see individual photographers needing that much space for a long, long time.
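Here's the back-of-envelope arithmetic as a Python sketch. The figures above work out if a petabyte is treated as a round 1,000,000GB; with the 1024-based definition the first number comes out a little higher. The 365.25-day year is my assumption, and the 8MB-per-photo figure for my camera is only implied by the 6875 number.

```python
DAYS_PER_YEAR = 365.25             # assumption: average year length

GB_PER_PB_DECIMAL = 1000 * 1000    # 1 PB = 1,000 TB = 1,000,000 GB
GB_PER_PB_BINARY = 1024 * 1024     # 1024-based: 1,048,576 GB


def years_to_fill(total_gb, gb_per_day):
    """Years needed to accumulate total_gb at a steady daily rate."""
    return total_gb / gb_per_day / DAYS_PER_YEAR


def gb_per_day_needed(total_gb, years):
    """Daily rate needed to accumulate total_gb within the given years."""
    return total_gb / (years * DAYS_PER_YEAR)


print(round(years_to_fill(GB_PER_PB_DECIMAL, 8)))       # 342
print(round(years_to_fill(GB_PER_PB_BINARY, 8)))        # 359
print(round(gb_per_day_needed(GB_PER_PB_DECIMAL, 50)))  # 55

# Photos per day at 55GB/day, counting 1GB as 1000MB
print(55 * 1000 // 22)    # 2500 shots at 22MB each
print(55 * 1000 / 6875)   # 8.0 MB per photo implied for my camera
```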

  1. and 1 tebibyte

Desperate solutions for dead hard drives

I’m trying to rescue a dying hard drive today. It’s suffering from the Click of Death, which means it’s going down no matter what, but it’d be really, really nice if I could get at its data.

I regularly deal with laptop hard drives. 95% of the time they’re slowly dying, and once a Windows system file conks out, I get called. This almost always turns out fine: I quickly copy the still-intact data, slap it all onto a new hard drive, and run a repair install / restore disk. But just occasionally the drives go downhill fast. In today’s case Windows broke at the weekend, and by the time I got there on Tuesday the drive was clicking. Clicking is not good – it means the drive is physically failing to read the data. If it won’t spin up, I can’t do anything.

There’s a solution, but it’s not cheap: you can send the disk to a data recovery centre. They’ll open the drive in their cleanroom and (I assume) transfer the data platters to something which reads them directly. Assuming the platters aren’t physically damaged, this will probably work well. But it’s very expensive – quotes this morning suggested ~£300 for a 40GB drive – and I don’t know anybody who’s actually done it, because, with laptops, the lost data are usually sentimental rather than critical. It’s not worth that expense, but people are still sad to lose it. This sucks.

I hate it when I can’t recover data. Obviously, everyone should have backups etc., but saying so is all well and good – in practice, most people don’t¹. And it’s still heartbreaking to lose, say, years of photographs. But there is one last, desperate trick you can try before paying a fortune / giving up. Put the drive in the freezer.

Honest. It contracts the metal, and has been known to bring drives back from the dead. Until they warm back up…but I only need 15mins for a drive image. I’m trying this today.

The drive in question refused to stop clicking, so I shoved it in the freezer for an hour. I then quickly slapped it into an external USB caddy, hit the power and…I’m pretty sure it spun up. Laptop drives are very quiet, but if I tilted it there was a definite force, so something was happening. Windows said “I’ve found a drive!”. And then sat there. And sat there. I reset the enclosure to try and kick things back into life, and this set it clicking again. Damn.

As I said, this is a last-ditch strategy. I’m really hoping that a bit longer in the freezer will do the trick – some say they’ve had drives fail after 4h but work after 24h. I’ll give it another few hours and try again. If that doesn’t work, I’ll try it overnight. I’d really like to get this one.

Update after another 2hrs: still nothing. It spins up, then starts clicking. Can’t think there’s much hope, to be honest, except there was that all-too-brief ‘disk drive found’ message from Windows…

Update 2: Sadly, this didn’t work. After an overnight freeze it refused to do anything for a minute, then just clicked as ever. I guess this type of click wasn’t the freezer-solvable one. Damn.

  1. backing up is still far too irritating for the average user, if you ask me. Norton Ghost is the most user-friendly system I know, but it’s confusing to set up. This should hopefully change as broadband speeds increase, as far simpler online backups will be able to handle music / photos

Not my finest hour

I totally messed up this weekend. I spent the whole time on my own, trying to fix that RAID array, and pissed off at least two groups of people I was supposed to meet up with. The computer’s finally all working as of 0130, but I can’t possibly charge for all the time I spent. I hate being beaten by problems, is the thing, and I have a bad habit of taking it personally when I can’t figure things out. But this was just silly, and I crafted a situation with no upside. Damn it.

Trying to recover deleted RAID partitions

I’ve been grappling with a broken RAID setup this weekend. I was given the computer with little more than “it’s broken”, and it’s taken a while to diagnose.

It wasn’t booting. It got so far as ‘listing pci devices’ and conked out. Usually you’ll see an error in such situations, but this one, helpfully, just hung. This was when I discovered the RAID0 setup. As far as I can tell, it came from the store with this configuration, which is stupid. RAID0 sucks. It lets you link multiple drives into one big space, and I think there are speed benefits, but this is all outweighed by the data being dependent on all the drives staying healthy. If any drives fail, you lose everything. Not good.
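To show why RAID0 makes me nervous, here's a toy Python sketch of striping. It's a simplification under my own assumptions (a 4-byte stripe, lists standing in for drives, and made-up function names), not how any real controller lays out data, but the principle is the same: every file is dealt across all the drives, so every drive is needed to read anything back.

```python
def stripe(data, n_drives, stripe_size=4):
    """Deal fixed-size chunks of data round-robin across n_drives."""
    drives = [[] for _ in range(n_drives)]
    for i in range(0, len(data), stripe_size):
        drives[(i // stripe_size) % n_drives].append(data[i:i + stripe_size])
    return drives


def reassemble(drives):
    """Interleave the chunks back into order; a dead drive is None."""
    if any(d is None for d in drives):
        raise OSError("a member drive is missing: the array cannot be read")
    out = []
    for row in range(max(len(d) for d in drives)):
        for d in drives:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


def array_failure_chance(p_drive, n_drives):
    """Chance that at least one of n independent drives fails."""
    return 1 - (1 - p_drive) ** n_drives


data = b"holiday photos, tax returns, everything"
drives = stripe(data, 2)
print(reassemble(drives) == data)   # True while both drives are healthy

drives[1] = None                    # one drive dies
try:
    reassemble(drives)
except OSError as err:
    print(err)

# With an illustrative 5% chance per drive, two drives nearly double the risk
print(round(array_failure_chance(0.05, 2), 4))   # 0.0975
```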

But the drives were fine: both passed a sector scan without issue. The RAM checked out too. For a while I thought it might be a boot sector thing, then eventually I slipstreamed an XP disc with the required RAID drivers, and the initial install process reported no partitions. Ok – maybe they got deleted somehow. But how best to investigate? Usually this is easy – just whack the drive into another computer, and run whatever data recovery is appropriate. But RAID is finicky, and I was wary. One wrong move and you’ve broken the array and made data recovery infinitely more difficult. I really wanted to leave the drives alone as much as possible.

Eventually I shoved in another drive, installed XP onto it (which wasn’t without evil BSOD complications), hooked up the RAID and ran Active@ Partition Recovery. This took an hour to find two deleted partitions, one of which contained all the user data – perfect! I hit the ‘Recover’ button and Active@ said ‘Please pay for the full version’. Now, I’m sure there’s freeware that can undelete partitions. I’m sure I could even do it manually, if I did the research. But the hell with it – the ‘recover’ button was right there, so I paid the £27 for the full version. This fixed the MBR and boot sectors, and mounted the drive in Windows.

Windows said ‘wtf something is b0rked here’. The partition was back, and Active@ could list its files, but Windows couldn’t quite figure it out. This is the kind of thing at which Scandisk excels. It usually works very well. But occasionally it’ll break things beyond belief, and a backup is advisable. So I switched to my favourite data recovery program: Restorer 2000 Pro. This little utility has saved me many, many times over the years. It scanned the major partition, and has spent the last six hours transferring all the data to yet another drive.

I’m currently waiting for Scandisk to complete. I think it’s adding index entries to every file on the disk. Either that or it’s stuck in an infinite loop. Time will tell.

Charging for this kind of work is always difficult. Half the time is spent waiting for scans to complete or data to transfer – I’ve got through half of The Diamond Age this weekend – but it’s not like you just leave it running, either: there’s always some query that means you have to check it every five minutes (Restorer 2000, for example, has a strop if you try to recover too many directories from the root at once, so you have to be on hand to manually start the process every quarter of an hour). Charging a full hourly rate would obviously be hideously expensive and morally wrong, but you obviously don’t want to feel like you’re wasting your time. You also can’t always predict how long something will take, so you can’t say to the client “I’ll do £x amount of work then give you a call”. It just doesn’t work that way – oftentimes stopping halfway through would mean leaving the computer in an even worse state. I tend to add it up and see what feels reasonable. I’m not going to charge more than the computer’s worth, even if the job has taken that long. I know people who tell me I’m wrong, but most of my work is for individuals with their home computers, and I don’t think it’s fair to charge silly money.

Ho hum. Scandisk is still indexing, and the drive’s chugging. Man, I really hope it’s doing something useful.

RpcSs killing processes in Windows 2000

For the last two days I’ve been struggling with a particularly irritating computer problem. I was called on Monday morning to say a Windows 2000 machine had a virus. An initial glance suggested spyware was killing processes: Explorer worked fine, but anything else – task manager included – was shut down immediately. This is pretty standard stuff for spyware, and I didn’t anticipate much trouble. Sadly, I was wrong.

I deleted an obvious ‘Windows Antispyware 2008’ to no effect, and virus / anti-spyware scans revealed nothing. I shut down all the non-essential services I could find, and even ran a quick scan for rootkits, but couldn’t find anything.

The problem was also there in Safe Mode, but not, I discovered by total chance, in Safe Mode with Networking. That was weird. The latter *should* just be the former + a network driver. This seemed consistent, but then it happened once in Safe Mode with Networking (SFw/N) too, and I started to think it might be hardware.

Admittedly it all felt a bit specific for that – you’d think hardware would kill everything, not just certain programs – but it could be to do with power draw. Plus, PSU problems have been known to have very weird symptoms. But a test PSU made no difference, the RAM checked out fine, and the (8-year-old) hard drive passed its fitness test. I thought I was onto something when I spotted the CPU fan slowing down and stopping in everything but SFw/N, but this was a red herring¹.

I eventually tracked it down by comparing the running processes in Safe Mode and Safe Mode w/ Networking (by repeatedly opening task manager and writing down names before it got nuked). The former, bizarrely, had an extra svchost.exe running. svchost.exe is a generic holder for background programs, and I needed more details. This is easy enough in XP, but in Windows 2000 you need the tlist support tool. The process turned out to be RpcSs: Remote Procedure Call. This was a new one on me, but it essentially controls background communications between programs. Disabling it solved the problem, but created a thousand more.
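If you can get the two process lists into text form, the comparison itself is easy to automate. Here's a minimal Python sketch; the process names below are made up for illustration rather than copied from the actual machine. Because svchost.exe legitimately appears several times, the lists are compared as counts rather than plain sets, otherwise an *extra* copy would go unnoticed.

```python
from collections import Counter


def extra_processes(listing_a, listing_b):
    """Return processes that appear more often in listing_a than in listing_b."""
    # Counter subtraction keeps only positive differences
    return dict(Counter(listing_a) - Counter(listing_b))


# Hypothetical listings, as scribbled down from task manager
safe_mode = [
    "System", "smss.exe", "csrss.exe", "winlogon.exe", "services.exe",
    "lsass.exe", "svchost.exe", "svchost.exe", "explorer.exe",
]
safe_mode_networking = [
    "System", "smss.exe", "csrss.exe", "winlogon.exe", "services.exe",
    "lsass.exe", "svchost.exe", "explorer.exe",
]

print(extra_processes(safe_mode, safe_mode_networking))   # {'svchost.exe': 1}
```

A plain set difference would have returned nothing here, since both modes contain at least one svchost.exe.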

Turns out, RpcSs is vital. And here’s where I got stuck. I just couldn’t find any elegant ways to fix it. RpcSs is too low-level and important, and can’t simply be reinstalled. Eventually I went with the old-school Magic Fix: the repair install. This just installs Windows over the top of itself, and while it’s often equivalent to using a sledgehammer to crack a wotsit, it generally solves the problem. Not this time. Windows died, and wouldn’t come back. In the end I was forced to reinstall from scratch, which is always the last resort².

That’s really irritating. Usually, the hard part is diagnosing the problem. Once I know what’s going wrong, it’s just a matter of research and thinking it through. It’s rare that I can know what’s wrong but be unable to do anything about it. My best guess is the initial spyware somehow took out RpcSs. Windows 2000 is a bit old-and-busted now, and I’m hoping XP is better secured against such things.

I’m mainly blogging this for googlers facing similar issues. I couldn’t find any references to problems manifesting in Safe Mode but not Safe Mode with Networking. Very odd one.

  1. the motherboard was actually slowing down the processor so it could disable the fan and keep things quiet. I turned this off.
  2. Also I’d forgotten Windows 2000 comes with IE5.0. Ugh.

spyder2express review

(This follows on from a post yesterday)

Succinct review: It rocks!

Longer review: I have two monitors. One is a 21″ widescreen Dell and one is a 15″ Samsung which I pilfered after my parents’ accounts assistant retired. The former is a high-quality monitor by average-consumer standards, the latter is decidedly not. I was intrigued as to how they’d fare when calibrated properly, and once the spyder arrived I eagerly unpacked it, read through the instructions and installed the software. Immediately there was a problem: it jumped straight onto my secondary monitor, and wouldn’t move. The only way around this was to disable the monitor, which was probably sensible anyway. I plugged in the calibrator and the software prompted me to hang the unit over the screen:

spyder2express

The obvious first reaction is ‘OMGHEADCRAB’. The second is ‘what’s it going to do, then?’. I started the calibration process, and it spent the next eight minutes displaying varying shades of red, green, blue and grey. It told me to remove the unit, then said ‘I’m done’ and showed a before/after switch. I flicked it.

It turns out my Dell is pretty good. The before/after warmed up the display a little – everything went slightly more orange than before – but nothing major. I didn’t care – the point was to get a calibration and put my mind at ease, and whether the adjustment was big or small was irrelevant. And now this was done – great! The software automatically told my Dell to use the spyder2express colour profile, and that was that. But the niggling voice at the back of my head said ‘how do you know? It’s a cheap calibrator – how can you be sure it’s done a good job?’. So I came up with a plan to test it.

The next step was calibrating the Samsung. This would generally be problematic as the spyder2express only supports one monitor – it will overwrite its previous profile if you re-run the calibration, and the files can’t be simply renamed in Windows Explorer¹. But I knew you could be sneaky and install the Microsoft Color Applet. This Control Panel extension gives you more sophisticated control over colour profiles, including the ability to rename them. So I renamed the Dell profile², then disabled my main monitor and ran the calibration on the Samsung.

This time: massive difference. The before/after button took me from cool blue to warm orange. 24h later I’m still noticing that the display is very different.

Now, two monitors provide a good test of the spyder’s abilities – since both monitors are calibrated, I should be able to put the same image up on both screens and see no difference. This was the moment of truth: if the colours are identical, the unit works as advertised. If they’re off, its quality is variable. And how much it’s off indicates how good a unit it really is. So I fired up the Lightroom 2.0 Beta, with its swish multi-monitor support, and opened a photo on both monitors simultaneously.

Result: to my eyes, identical. I could see contrast differences between the monitors, but there was no difference in the colour. Brilliant. I spent ages playing around in Lightroom, changing white balances and whatnot, and it remained consistent no matter what I threw at it.

Yesterday afternoon I took the spyder to my parents’ office. My Dad’s monitor is notoriously bad – the colours have always been washed out, and I have to keep its brightness low to prevent photos posterising. I was skeptical the spyder could do much with it, to be honest, but ran it anyway. The before/after showed a huge difference. Everything was again much warmer, but somehow less bright, too. I checked out his Picasa and photos looked much, much better, but I also found I could up the brightness much more without any posterising. I’m amazed at the difference. Mum’s monitor was better, and became a little warmer, but nothing too drastic. I brought up the same flickr page on both machines, stepped back, and they looked exactly the same. Excellent.

I was worried the spyder would be a disappointment. I was prepared to pay to get better colours than before, but I was hoping it wouldn’t be just mediocre. And it isn’t – I really couldn’t ask for anything more. I suppose the final test comes when I send some images off to the sRGB printers, but I’m confident – if it can produce perfectly matching colours on my monitors, there’s no reason to think it’s not matching the specifications. I spent most of today editing photos from my niece’s Naming Day last weekend, and it’s great to know that my parents will be seeing the same colours I am.

A bit more geeky stuff follows.


  1. possibly – I’ve seen conflicting reports about this
  2. I kept getting in-use errors, so I actually ended up copying the profile in Windows Explorer, then renaming this copy in the Applet

Colour problems with my photographs #2

Colours are annoying, particularly when you’re messing around with digital photos. If I email a photo to ten people at ten different computers, they’re all going to see slightly different colours. This is because every monitor has unique quirks in its colours. It’s a trade-off of non-professional consumer hardware, and is perfectly reasonable – most people don’t need to worry about how exactly their photos will appear on other machines. Unfortunately, I am no longer one of those people.

For example, I want to make a Blurb book of my Year 25 photos, and I’d obviously like the colours I see on-screen to be very close to the final result. Now, Blurb print their colours according to the sRGB standard. sRGB is a widely-used standard colour space that pins down exactly which colour each value means: any two printers, if calibrated to it, should print the exact same colour if I say ‘dudes, print me some green’. And computer monitors can be calibrated too – if I can ensure the green on my monitor matches the green in the sRGB specification, problem solved! But my monitor isn’t calibrated – I have no idea how well the colours on my screen match the sRGB colours. If my monitor is rubbish at green and displays it darker than it should, I’ll lighten my greens to compensate, and I’m going to get a Blurb book in which all the greens are too light.

So the question becomes, how do I ensure that I’m seeing the right colours? How can I calibrate my monitor? It’s possible to alter the colour balance in Windows, but that doesn’t help – Windows only knows it’s telling the monitor to display green, it can’t tell what colour my monitor is actually showing.

Thankfully, there’s an easy solution: I need to buy a hardware calibrator. This is a device that physically looks at the monitor while the computer displays a pre-determined series of colours. The accompanying software analyses the calibrator’s data and determines the difference between the theory and the reality. Then comes the clever bit – it adds a layer between the image and the monitor, called a colour profile. So your photo says ‘I am green’, then the colour profile says ‘right, I know that this particular green will come out too dark, so I’m going to tell the monitor to display a lighter green – one that will show a truly representative colour’. The photo isn’t changed at all, but the colour profile ensures you’re seeing the correct colours¹.
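The idea can be sketched as a toy model in Python. Everything here is an assumption for illustration: a single colour channel on a 0 to 1 scale, a pretend monitor that darkens mid-tones by a made-up amount, and hypothetical function names. Real calibrators and ICC profiles are far more involved, but the shape is the same: measure what the monitor actually shows, then build a lookup table that sends it a compensating signal.

```python
def monitor_response(signal, darkening=1.2):
    """Hypothetical monitor that shows mid-tones darker than asked (0..1)."""
    return signal ** darkening


def build_profile(measure, steps=256):
    """'Calibrate': for each target level, find the signal whose
    measured output comes closest to that target."""
    levels = [i / (steps - 1) for i in range(steps)]
    measured = [measure(s) for s in levels]   # what the sensor sees
    profile = []
    for target in levels:
        best = min(range(steps), key=lambda i: abs(measured[i] - target))
        profile.append(levels[best])
    return profile


def apply_profile(profile, value):
    """Look up the corrected signal to send for a requested value."""
    return profile[round(value * (len(profile) - 1))]


profile = build_profile(monitor_response)

photo_green = 0.5   # the photo's own value is never modified
uncorrected = monitor_response(photo_green)
corrected = monitor_response(apply_profile(profile, photo_green))

print(uncorrected)  # noticeably darker than the 0.5 requested
print(corrected)    # very close to 0.5
```

Note that `photo_green` stays at 0.5 throughout; only the signal sent to the monitor changes.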

Unfortunately, a decent hardware calibrator costs £130. I can’t justify that for something I’m going to use once. But I’ve had various paying photo jobs recently(!), and I’ve become increasingly paranoid that colour-matching will bite me in the ass at some point. What if my not-too-shabby-but-getting-on-a-bit Dell 2004fpw is way out? I’ve had pictures printed before and they’ve been close enough, but what if the lab optimised them to fix the problems?

Then I discovered the Spyder2express hardware calibrator.

It’s a cheaper version of the £130 recommended-everywhere Spyder2. It only supports one monitor. There’s little in the way of configuration. But the reviews say it does a good job and actually uses the same hardware as its more expensive siblings – it’s the software that’s crippled, and the results aren’t necessarily as good as the fancier models. It also has the major advantage of ‘only’ costing £62 inc. delivery.

My paranoia got the better of me. I didn’t want the worry, and I figured pretty-close-but-not-perfect was much better than hope-things-turn-out-ok. I bought one. It arrived this morning.

Now, I knew this wasn’t going to be the most exciting purchase ever, and I was preparing myself for the anticlimax. I figured I’d use it once, then sit there looking at my £65 and wonder whether I’d made a mistake. Review tomorrow.

  1. this always reminds me of the Hubble space telescope, which had a problem with its mirror after it was first launched, meaning it produced fuzzy images. They couldn’t replace the mirror in situ, so they designed a filter that exactly reversed the mirror’s flaw, and stuck it between the mirror and the CCDs. It worked.

eee: eee!

Yesterday’s post brought me two toys:

Toys

The ‘brella is mine. The laptop I’m setting up for a friend. But this is no ordinary laptop, this is an eee pc. Alice of the wonderful Wonderland got one a while back, and her initial possible-typo thought has been ringing around my head for 48hrs, because it sums the thing up perfectly: IT TITCHY! Here’s a better picture, actual size¹:

eee

See? It titchy! I’m in love. It’s 23 x 17cm and in its case weighs 976g, which isn’t much more than a large book, or my camera. It has wifi, 512MB RAM, three USB slots, a 3hr battery, a VGA port, an SD-card slot, two speakers and a webcam. It runs Linux, boots in 15 seconds, shuts down in 5 and comes with OpenOffice.org, Firefox and Skype. Best of all, it only cost a shade over £220 – brand new.

Clearly there’s a compromise somewhere, and it’s mainly in power and disk space. It’s not at all fast – 630MHz – and the hard drive is only 4GB². Plus, the screen resolution is only 800×480, being as how it’s only 7″ on the diagonal. But if all you want to do is surf, type and chat, you don’t need any more than that. Couple this thing with an apparently-compatible Huawei PAYG Mobile Broadband stick and you’ve got 1Mbps internet access you can throw into your bag just in case. Is brilliant.

The keyboard is obviously tiny tiny tiny, and takes some getting used to. But it’s at least a standard layout, and I adapted pretty quickly. The mouse ‘buttons’, it has to be said, are godawful, but thankfully the trackpad supports tapping. The machine recognised my USB drive straight away and I was able to transfer files from my XP machine without issue³. The screen is just large enough that text is readable without straining, but it’s close.

The menu system is fairly unexceptional, and buries the good stuff in with a load of less-than-useful programs, but does the job. It’s not officially editable, but activate the ‘advanced mode’ and you’ve got the full configurability⁴ of Linux. About which I know nothing, but I had a crack anyway. The machine is popular enough that the eee wiki has many, many guides on unlocking advanced features without screwing everything up, and I went through a few step-by-step. The instructions suffer from the usual crowd-sourced documentation problems in that they can veer from incredibly useful to ‘oh, and before you do the next step you’ll need to rebuild the kernel – once you’ve done that…’, but are on the whole good. It has a problem out-of-the-box that prevents it from connecting to wireless networks that have WPA keys containing spaces; I was able to fix this by overwriting a couple of system files. I also tidied up the default layout, upgraded to OpenOffice.org 2.3, and enabled the option to boot into KDE. You can do much more – and for £40 you can upgrade it to a touchscreen(!) – but as it’s not mine I stopped there.

I’d be saving up for one, but it’s no use at all for anything photographic. Sure I could probably shove the GIMP on there, or even try XP and Photoshop if I thought I could handle the speed, but the 4GB drive is just too small. My camera’s memory card is twice that, so it’d be no use for backing up ‘in the field’, and I can’t imagine that editing a 3888×2592 file on that screen would be much fun. The eee has inspired a whole host of other micro-laptops, but they all seem to be coming in far more expensive, sadly.

My friend spends 4hrs on a wifi-enabled bus every day, but gets fed up of lugging a full-size laptop around. This should be a perfect solution, and I have to hand it over tomorrow. I am sad. I’ve named it and everything. Still, at least I have a ‘brella.

  1. not true
  2. although it’s a blindingly-quick solid-state drive
  3. although god only knows how you find the drive in the Xandros open/save menus if you don’t click ‘open in file manager’ immediately
  4. is this a word?

A weekend of playing with Cisco

I’ve been exhausted today, after a heavy weekend. A friend invited me to help install and configure a startup’s network, and both nights neither of us got to sleep until 0300.

The company had quite the setup: 24″ monitors, VoIP phones, a beautifully-sunlit open-plan office, Aeron chairs, the lot. Their building had network wiring already, and it was our job to get everything connected and talking to each other (or not, if you’re a VoIP phone and a PC). I’ve never configured anything quite so high-end before. We had Sawyer the 24-port gigabit ethernet switch (brawn, didn’t need to do anything fancy), Jack the 24-port fast-ethernet switch (less powerful, but needed to do clever routings) and Hurley the wireless router (wireless = the cool bit) all connecting to Kate the ultra-configurable mega-secure Cisco router (ultimately in charge, and physically under both the switches). Everyone needed internet access, and it all had to work via DHCP – all settings being supplied automatically once connected to the wall / wireless. Each component threw up problems at times, and it was quite the challenge.

As ever, the toughest problems were sometimes the fastest – denying intra-subnet communication took five minutes, despite being a major worry – while the insignificant things ate up time – the network printer Just Didn’t Respond, and took two hours to fix. At times we delved into Cisco’s formidable command-line-interface, and discovered various deficiencies in their generally ultra-swish GUI. We also ate a lot of muffins. And bon-bons.

By 0130 on Monday morning everything was wired up and talking to each other. It was quite the relief! Today we heard nothing until this evening, when a call said everything had run fine. This is pretty rare – there’s always something broken – and we’re concerned they’re using next door’s wireless.

There was a hell of a learning curve and the pressure got to us both at times, but it was great fun nevertheless. I’ve also grown quite fond of Cisco routers. You might need a degree in jargon to configure the things, but they’re seriously powerful toys.