This page is a draft.

This is part 3 in a series about a project to read/import a large collection of home-made optical media. Part 2 was Imaging DVD-Rs: Initial import of the discs; the summary page for the whole project is imaging discs.

If you have managed to read at least part of the data from a disc, you should be able to pull a list of the files stored in the image.

There are several different tools you can use for this. I've summarized some of them, with their advantages and disadvantages, at ls-tools. In brief, I recommend using a "loopback mount", in other words letting the Linux kernel read the filesystem for you:

sudo mount -o loop ./my_files.iso /mnt
cd /mnt
find . -ls

Collecting metadata

As well as the disc image itself and the ddrescue log file, I recommend storing a list of each disc image's contents and the corresponding metadata. This means I can extract the files from an ISO and move or rename them into my archive storage locations, but still retain a record of what was on the disc.

What metadata is useful? That depends on what you want to do with it later. I tend to want to preserve the filename, file size, last modified date, and a checksum of the file's contents. This means it's generally possible to relate a file back to an entry in the metadata even if it has been renamed, and to detect when a file has been modified.
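
As a rough illustration, the four fields above can be gathered with nothing but the Python standard library. This is a minimal sketch, not the format any particular tool uses: the function names, the record keys and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import os
from datetime import datetime, timezone


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_metadata(root):
    """Return one record per regular file under root, in a stable order."""
    records = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a deterministic order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks, devices and other special files
            info = os.stat(full)
            # Whole seconds, in UTC, so records compare equal across machines
            mtime = datetime.fromtimestamp(int(info.st_mtime), tz=timezone.utc)
            records.append({
                "path": os.path.relpath(full, root),  # hypothetical key names
                "size": info.st_size,
                "mtime": mtime.isoformat(),
                "sha256": file_sha256(full),
            })
    return records
```

Calling collect_metadata("/mnt") on a loopback-mounted image returns a list of dictionaries that can be written out with the json module and kept alongside the image.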

summain (and the newer, faster, rewritten-in-Rust summainrs) can collect this information from a directory tree and write it to a structured "manifest" file that other tools can easily read. They can also be used to check whether a directory tree matches a manifest (by generating a new one and comparing the two).
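
The comparison step can be sketched in a few lines of Python. This is not how summain or summainrs work internally; it assumes, purely for illustration, that each manifest has already been loaded into a dictionary mapping a path to a (size, checksum) pair, and the function name is made up.

```python
def compare_manifests(old, new):
    """Compare two manifests given as {path: (size, checksum)} dicts."""
    # Same path in both manifests but different size or checksum
    modified = sorted(p for p in old if p in new and old[p] != new[p])
    gone = sorted(p for p in old if p not in new)
    extra = sorted(p for p in new if p not in old)

    # Index the new-only paths by content so renames can be spotted
    by_content = {}
    for path in extra:
        by_content.setdefault(new[path], []).append(path)

    renamed, missing = [], []
    for path in gone:
        candidates = by_content.get(old[path])
        if candidates:
            # Identical content under a new name: treat it as a rename
            renamed.append((path, candidates.pop(0)))
        else:
            missing.append(path)

    # Whatever was not claimed by a rename is genuinely new
    added = sorted(p for paths in by_content.values() for p in paths)
    return {"modified": modified, "renamed": renamed,
            "missing": missing, "added": added}
```

Matching on size and checksum alone means two identical files cannot be told apart, so a rename is only a best guess when duplicates exist.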

Finding out what data is not extracted

If you have a partial disc image, it can be useful to know which files you have all the data for and which you do not. You can then decide whether it is worth continuing to try to read the disc.

Sometimes, the damage is confined to areas of the image which are not actually occupied by files at all, and all the files can be extracted successfully.

My badiso tool takes the path to an ISO image (e.g. image.iso) and a corresponding ddrescue log file (image.log) on the command line and prints out a file listing, indicating complete files with a green tick and incomplete files with a red cross.


badiso is currently written in Python and builds on top of xorriso.
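
The underlying check can be sketched roughly as follows. This is not badiso's actual code. It assumes that the ddrescue log (called a mapfile in recent ddrescue versions) uses the usual layout of comment lines, one status line, then "pos size status" lines where "+" marks rescued data; that each file occupies a single contiguous extent; and that the file's starting sector and length have already been obtained from the ISO's directory records by some other means. The function names are hypothetical.

```python
SECTOR_SIZE = 2048  # logical block size assumed for an ISO 9660 data disc


def parse_mapfile(text):
    """Return sorted (start, end) byte ranges marked as rescued ('+')."""
    rescued = []
    seen_status_line = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if not seen_status_line:
            seen_status_line = True  # first data line is the current position/status
            continue
        fields = line.split()
        # base 0 lets int() accept the 0x-prefixed hex that ddrescue writes
        pos, size, status = int(fields[0], 0), int(fields[1], 0), fields[2]
        if status == "+":
            rescued.append((pos, pos + size))
    return sorted(rescued)


def is_complete(start_lba, length, rescued):
    """True if every byte of the file's extent lies inside rescued ranges."""
    cursor = start_lba * SECTOR_SIZE
    end = cursor + length
    for lo, hi in rescued:
        if cursor >= end:
            break
        if lo <= cursor < hi:
            cursor = hi  # covered up to the end of this rescued range
    return cursor >= end
```

A file whose extent touches any block that is not marked "+" is reported as incomplete, even if the unread part turns out to be readable on a later pass.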