Below are the five most recent posts in my weblog. You can also see a chronological list of all posts, dating back to 1999.
I'm long overdue writing about what I'm doing for my PhD, so here goes. To stop this getting too long I haven't defined a lot of concepts so it might not make sense to folks without a Computer Science background. I'm happy to answer any questions in the comments.
I'm investigating whether there are advantages to building a distributed stream processing system using pure functional programming: specifically, whether the ability to reason about purely functional systems allows us to build efficient stream processing systems.
We have a proof-of-concept of a stream processing system built using Haskell called STRIoT (Stream Processing for IoT). Via STRIoT, a user can define a graph of stream processing operations from a set of 8 purely functional operators. The chosen operators have well-understood semantics, so we can apply strong reasoning to the user-defined stream graph. STRIoT supports partitioning a stream graph into separate sub-graphs which are distributed to separate nodes, interconnected via the Internet. The examples provided with STRIoT use Docker and Docker Compose for the distribution.
The area I am currently focussing on is whether and how STRIoT could rewrite the stream processing graph, preserving its functional behaviour but improving its performance against one or more non-functional requirements: for example, making it run faster, or use less memory, or meeting a more complex requirement such as maximising battery life for a battery-operated component.
Pure FP gives us the ability to safely rewrite chunks of programs by applying equational reasoning. For example, we can always replace the left-hand side of this equation by the right-hand side, which is functionally equivalent, but more efficient in both time and space terms:
map f . map g = map (f . g)
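To see why the two sides agree, here is a short pointwise sketch, assuming an arbitrary finite list (the element names x_1 ... x_n are just for illustration). The left-hand side walks the list twice and builds an intermediate list; the right-hand side walks it once.

```latex
\begin{align*}
(\mathit{map}\ f \circ \mathit{map}\ g)\,[x_1, \dots, x_n]
  &= \mathit{map}\ f\,(\mathit{map}\ g\,[x_1, \dots, x_n]) \\
  &= \mathit{map}\ f\,[g\,x_1, \dots, g\,x_n] \\
  &= [f\,(g\,x_1), \dots, f\,(g\,x_n)] \\
  &= [(f \circ g)\,x_1, \dots, (f \circ g)\,x_n] \\
  &= \mathit{map}\ (f \circ g)\,[x_1, \dots, x_n]
\end{align*}
```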
However, we need to reason about potentially conflicting requirements. We might sometimes accept increased network latency or overall processing time in order to reduce the power usage of nodes, such as smart watches or battery-operated sensors deployed in difficult-to-reach locations. This has implications for the design of the Optimizer, which I am exploring.
Last Saturday I joined roughly 65,000 other people to see the Cure play a 40th Anniversary celebration gig in Hyde Park, London. It was a short gig (by Cure standards) of about 2½ hours due to the venue's strict curfew, and as predicted, the set was (for the most part) a straightforward run through the greatest hits. However, the atmosphere was fantastic. It may have been helped along by the great weather we were enjoying (over 30°C), England winning a World Cup match a few hours earlier, and the infectious joy of London Pride that took place a short trip up the road. A great time was had by all.
Last year, a friend of mine who had never listened to the Cure asked me to recommend (only) 5 songs which would give a reasonable overview (5 from over 200 studio-recorded songs). As with Coil, this is quite a challenging task, and here's what I came up with. In most cases, the videos are from the Hyde Park show (but it's worth seeking out the studio versions too).
1. "Pictures of You"
Walking a delicate line between their dark and light songs, "Pictures of You" is one of those rare songs where the extended remix is possibly better than the original (which is not short either)
2. "If Only Tonight We Could Sleep"
I love this song. I'm a complete sucker for the Phrygian scale. I was extremely happy to finally catch it live for the first time at Hyde Park, which was my fourth Cure gig (and hopefully not my last).
The nu-metal band Deftones have occasionally covered this song live, and they do a fantastic job of it. They played it this year for their Meltdown appearance, and a version appears on their "B-Sides & Rarities". My favourite take was from a 2004 appearance on MTV's "MTV Icon" programme honouring the Cure:
3. "Killing An Arab"
The provocatively-titled first single by the group takes its name from the pivotal scene in the Albert Camus novel "The Stranger" and is not actually endorsing the murder of people. Despite this it's an unfortunate title, and in recent years they have often performed it as "Killing Another". The song loses nothing in renaming, in my opinion.
The original recording is a sparse, tight, angular post-punk piece, but it's in the live setting that this song really shines, and it's a live version I recommend you try.
4. "Just Like Heaven"
It might be obvious that my tastes align more with the Cure's dark side than the light, but the light side can't be ignored. Most of their greatest hits and best-known work are light, accessible pop classics. Choosing just one was amongst the hardest decisions to make. For the selection I offered my friend, I opted for "Friday I'm In Love", which is unabashed joy, but it didn't meet a warm reception, so I now replace it with "Just Like Heaven".
Bonus video: someone proposed in the middle of this song!
5. "The Drowning Man"
From their "Very Dark" period, another literature-influenced track, this time Mervyn Peake's "Gormenghast": "The Drowning Man"
If you let the video run on, you'll get a bonus 6th track, similarly rarely performed live: Faith. I haven't seen either live yet. Maybe one day!
Since first writing about my archiving activities in 2012 I've been meaning to write an update on what I've been up to, but I haven't got around to it. This, however, is notable enough to be worth writing about!
In the last few months I became chair of the Historic Computing Committee at Newcastle University. We are responsible for a huge collection of historic computing artefacts from the University's past, going back to the 1950s, which has been almost single-handedly assembled and curated over the course of decades by the late Roger Broughton, who did much of the work in his retirement.
Sadly, Roger died in 2016.
Recently there has been an upsurge of interest and support for our project, partly as a result of other volunteers stepping in and partly due to the School of Computing moving to a purpose-built building and celebrating its 60th birthday.
We've managed to secure some funding from various sources to purchase proper, museum-grade storage and display cabinets. Although portions of the collection have been exhibited for one-off events, including School open days, this will be the first time that a substantial portion of the collection will be on (semi-)permanent public display.
Things have been moving very quickly recently. I am very happy to announce that the initial public displays will be unveiled as part of the Great Exhibition of the North! Most of the details are still TBC, but if you are interested you can keep an eye on this A History Of Computing events page.
For more about the Historic Computing Committee, cs-history Special Interest Group and related stuff, you can follow the CS History SIG blog, which we will hopefully be updating more often going forward. For the Historic Computing Collection specifically, please see The Roger Broughton Museum of Computing Artefacts.
This is part 2 in a series about a project to read/import a large collection of home-made optical media. Part 1 was Imaging DVD-Rs: Overview and Step 1; the summary page for the whole project is imaging discs.
Last time we prepared for the import by gathering all our discs together and organising storage for them in two senses: real-world (i.e. spindles) and a more future-proof digital storage system for the data, in my case, a NAS. This time we're actually going to read some discs. I suggest doing a quick first pass over your collection to image all the trouble-free discs (and identify the ones that are going to be harder to read). We will return to the troublesome ones in a later part.
For reading home-made optical discs, you could simply use
cp /dev/sr0 disc-image.iso
This has the attraction of being a very simple solution but I don't recommend it, because of a lack of options for error handling. Instead I recommend using GNU ddrescue. It is designed to be fault tolerant and retries bad sectors in various ways to try and coax every last byte out of the medium. Crucially, a partially imported disc image can be further added to by subsequent runs of ddrescue, even on a separate computer.
For the first import, I recommend the suggested options from the
ddrescue -n -b2048 /dev/cdrom cdimage.iso cdimage.log
This will create a cdimage.iso file, hopefully containing your data, and a cdimage.log, describing what ddrescue managed to achieve. You should archive both!
This will either complete reasonably quickly (within one to two minutes), or will run potentially indefinitely. Once you've got a feel for how long a successful extraction takes, I'd recommend terminating any attempt that lasts much longer than that, and putting those discs to one side in a "needs attention" pile, to be re-attempted later. If ddrescue does finish, it will tell you if it couldn't read any of the disc. If so, put that disc in the "needs attention" pile too.
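The triage described above could be scripted roughly as follows. This is only a sketch under some assumptions of mine: GNU coreutils timeout is installed, the drive is /dev/cdrom, and five minutes is a sensible cut-off for your drive. The function names fully_read and first_pass are made up for this example. The check relies on the ddrescue log (mapfile) format, where the first non-comment line is a status line and every later line describes a block whose third field is "+" once it has been read successfully.

```shell
#!/bin/sh
# Hypothetical first-pass helpers; adjust device, time limit and names.

# fully_read LOGFILE: succeed only if every block listed in the ddrescue
# log is marked '+' (finished). A missing or empty log counts as failure.
fully_read() {
    awk '
        /^#/ || NF == 0 { next }                 # skip comments and blank lines
        !seen_status { seen_status = 1; next }   # first data line is the status line
        { blocks++ }
        $3 != "+" { bad = 1 }                    # any other status means unread data
        END { if (blocks > 0 && !bad) exit 0; exit 1 }
    ' "$1"
}

# first_pass NAME: image the disc into NAME.iso and NAME.log, giving up
# after 300 seconds (an assumed limit), then print which pile it belongs in.
first_pass() {
    name=$1
    timeout --signal=INT 300 ddrescue -n -b2048 /dev/cdrom "$name.iso" "$name.log"
    if fully_read "$name.log"; then
        echo "$name: ok"
    else
        echo "$name: needs attention"
    fi
}
```

After sourcing this, something like first_pass cdimage would image the disc currently in the drive. Both the image and the log are kept either way, so a later run of ddrescue can pick up where this one stopped.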
Above, I wrote that I recommend this approach for home-made data discs. Broadly, I am assuming that such discs use a limited set of options and features available to disc authors: they'll either be single session, or multisession but you aren't interested in any files that are masked by later sessions; they won't be mixed mode (no Audio tracks); there won't be anything unusual or important stored in the disc metadata, title, or subcodes; etcetera.
This is not always the case for commercial discs, or audio CDs or video DVDs. For those, you may wish to recover more information than is available to you via ddrescue. These aren't my focus right now, so I don't have much advice on how to handle them, although I might in the future.
labelling and storing images
If your discs are labelled as poorly or inconsistently as mine, it might not be obvious what filename to give each disc image. For my project I decided to append a new label to all imported discs, something like "blahX", where X is an incrementing number. So, for a fifth disc being imported with the label "my files", the image name would be my_files.blah5.iso. If you are keeping the physical discs after importing them, you could also mark the disc with "blah5".
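If you want to generate these names consistently, a tiny helper along these lines might do. It is a sketch only: the function name image_name is made up, and replacing whitespace with underscores is my guess at the convention, based on the example name above.

```shell
#!/bin/sh
# image_name LABEL NUMBER: build an image filename from a disc's written
# label and its sequence number (hypothetical helper).
image_name() {
    label=$1
    number=$2
    # turn each run of whitespace in the label into a single underscore
    safe=$(printf '%s' "$label" | tr -s '[:space:]' '_')
    printf '%s.blah%s.iso\n' "$safe" "$number"
}

image_name "my files" 5   # prints my_files.blah5.iso
```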
where are we now
You should now have a pile of discs that you have successfully imported, a corresponding collection of disc image files/ddrescue log file pairs, and possibly a pile of "needs attention" discs.
In future parts, we will look at how to explore what's actually on the discs we have imaged; how to handle partially read or corrupted disc images; how to map the files on a disc to the sectors you have read, to identify which files are corrupted; and how to try to coax successful reads out of troublesome discs.
Older posts are available on the all posts page.