The Work

Let the Identifier Identify; Let the Metadata Describe

I have a sign above my desk:

Yesterday, I mentioned that in 1998, there were 900,000 books in print. In 2012 – fourteen years later – there are over 32 million.

This is a massive disruption. Not only are the established publishing houses churning out more titles year after year, but new companies are starting up and authors are self-publishing, so there are a lot of new entrants into the market. And things that veterans have long taken for granted – the ISBN, for example – are being called into question by these newcomers.

It constantly amazes me that this number can wreak so much havoc, but for the last several years I’ve managed to devote an hour a week, virtually every week (except for a several-month hiatus due to work), on Twitter to troubleshooting and mythbusting around the ISBN. Yes…#ISBNhour has been going on for years. I can’t believe it myself. But somewhere, to someone, the principles of this standard (which has been in existence since the 1970s!) are always still news.

And that’s just a single identifier! Granted, it’s the most basic, most fundamental number in our business – without this number, there generally isn’t much in the way of sales – but there are others! But the ISBN looms so large – and teaching people how to use it is so critical – that the other identifiers get short shrift.

I’ll hype those other numbers in a different post. But what I’ve seen over the years is that people tend to confuse identifiers with metadata – and this is the primary trouble around the ISBN. And I want to untangle that – in a way that everyone can refer back to.

George Wright III, who runs a company called PiPS, is a member of BISG. And I’ll never forget one meeting where we were discussing ISBNs and ebooks, and the possibility of appending a suffix onto the ISBN to distinctly identify ebooks (yes, we already thought of that). George promptly interjected – in his inimitable gravelly voice – “Let the identifier identify; let the metadata describe.”

In other words, the job of an identifier is to distinguish one thing from the next. “This thing is not that thing.” That is all an identifier does. It’s really so simple it’s almost unbelievable. But think about your social security number. It just tells the government you are not any one of the other 300 million people in the country. Or your driver’s license number – which tells the state that you are 158 256 789 and not 159 233 467. That’s all it tells anyone. The rest – your name, your address, your date of birth – is metadata. But none of that metadata is embedded in the driver’s license number. It’s just a number.

But sometimes, with identifiers, we ascribe meaning to them; we interpret them. Area codes are a good example – because these, too, are in the midst of great disruption now. In 1999, there were so many phone numbers in Manhattan that a new area code needed to be established. The borough was in an uproar – 212 was “prestigious” and meant the real Manhattan (and hence the real New York), but 646 was an upstart.

Now there are so many area codes all across the country that we’re constantly looking them up to find out where people are calling from. And even that doesn’t tell us everything – I have many acquaintances with cell phones from one part of the country but who have moved elsewhere and taken their phone numbers with them. (I myself have had the same cell phone number since 1998.) So our phone numbers don’t necessarily have anything to do with location anymore. Phone numbers are rapidly becoming dumb numbers – a string of digits that carries no intrinsic meaning. But in your contacts file – a database, in other words – that phone number is unique. It’s distinct. It allows you to build a record around it – where you can put the metadata about the person: name, email address, etc. The identifier identifies – it sets one thing apart from another; the metadata describes.

Now let’s go back to ISBNs. Those of us who’ve been in the business for decades have come to see ISBNs as “smart” numbers. There’s a prefix – 978 or 979 – which designates the product as being in the book supply chain. There’s another prefix – of varying length – that designates the publisher. There’s the identifier of the book itself, which is supposed to be a dumb number. And there’s a check digit, which is the result of a formula that ensures that the entire number is valid.
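For the curious, here is a rough sketch of that check digit formula as it works for 13-digit ISBNs: the first twelve digits are weighted alternately by 1 and 3, and the check digit is whatever brings the total up to a multiple of 10. The function names are my own invention for illustration, and the sample number is just a commonly used example, not one tied to anything in this post.

```python
def isbn13_check_digit(first_twelve: str) -> int:
    """Return the check digit for the first twelve digits of an ISBN-13."""
    if len(first_twelve) != 12 or not (first_twelve.isascii() and first_twelve.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(
        int(ch) * (1 if i % 2 == 0 else 3)
        for i, ch in enumerate(first_twelve)
    )
    # The check digit tops the weighted sum up to the next multiple of 10.
    return (10 - total % 10) % 10


def is_valid_isbn13(isbn: str) -> bool:
    """Check a 13-digit ISBN, ignoring hyphens and spaces."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return isbn13_check_digit(digits[:12]) == int(digits[12])


print(isbn13_check_digit("978030640615"))    # 7
print(is_valid_isbn13("978-0-306-40615-7"))  # True
```

Notice what the check digit does and does not do: it catches a mistyped number, but it tells you nothing about the book.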

Here’s where we get in trouble: the publisher prefix. That bit, which comes after the 978 or 979, ultimately comes to be regarded as sort of a vanity license plate for a publisher. Just as desirable Manhattan phone numbers began with 212, so desirable ISBN prefixes began with 0385 (Doubleday).

But what happens when Random House buys Doubleday and eventually puts it out of business? What happens to all of Doubleday’s books – do they all now get Random House ISBNs? What happens to the backlog of unassigned ISBNs at Doubleday – do they evaporate?


Doubleday’s books – so long as they remain in print – continue with their existing ISBNs. And Doubleday’s outstanding ISBN pool – those that haven’t already been assigned to books – gets merged with Random House’s. So, in essence, Random House has several publisher prefixes. You can’t tell one from another. And the more companies that Random House buys, the more prefixes it has available to use. If it sells off a division, those ISBNs become property of the purchasing publisher. In an age of 32 million ISBNs, and over half a million prefixes, the ISBN can no longer “mean” anything, any more than an area code does.

Which brings us to…the eISBN.

Just as a publisher prefix cannot “mean” anything anymore, the ISBN is not meant to describe the format of a book. Again, that’s the job of the metadata. The ISBN identifies any trade-able product in the book supply chain. The ISBN only says “this thing is not that thing”. The metadata describes what it is, what format it comes in, how long it is, how much it costs, and everything else.

Calendars are sold in the book supply chain. Calendars get ISBNs. They don’t get cISBNs.

There is no such thing as an eISBN. Ebooks get ISBNs. And those ISBNs mean nothing in themselves, except that this ebook is not that ebook. The metadata – which includes the format – describes what kind of book it is. Attempting to divine meaning from the ISBN as it applies to ebooks is only marginally more reliable than divining your future from the lines of your palm.
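If it helps to see it the way a database would, here is a minimal sketch of the idea. The ISBN is nothing more than the key; everything descriptive, including the format, lives in the record behind it. The field names, titles, and prices are made up for illustration, and the three ISBNs are sample numbers standing in for real ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProductMetadata:
    # Hypothetical fields; a real metadata record carries far more.
    title: str
    product_form: str  # e.g. "hardcover", "ebook", "calendar"
    price_usd: float


# The ISBN is only the key. It says "this thing is not that thing".
catalog = {
    "9780306406157": ProductMetadata("An Example Title", "hardcover", 24.95),
    "9781402894626": ProductMetadata("An Example Title", "ebook", 9.99),
    "9783161484100": ProductMetadata("An Example Wall Calendar", "calendar", 14.99),
}


def product_form(isbn: str) -> str:
    """Look the format up in the metadata; never infer it from the number."""
    return catalog[isbn].product_form


for isbn in catalog:
    print(isbn, "->", product_form(isbn))
```

All three keys are the same kind of number. Nothing in the digits marks the second one as an ebook or the third as a calendar; only the lookup can tell you that.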

There are vendors who ask publishers for eISBNs. Don’t be confused. There is no such thing. They are asking for the ISBNs of your ebooks. (And those vendors should know better, and we are talking.) There are periodicals that publish reviews with eISBNs. Again, there is no such thing. They are publishing the ISBNs of the ebooks. (And these periodicals should know better, and we are talking.)

When Books in Print registers information about ebooks, it doesn’t discriminate. An ISBN is an ISBN is an ISBN – whether it belongs to an ebook, a print book, or a calendar.

And if you can’t stop saying “eISBN” for yourself, do it for the kittens.
