Talking About Books: ONIX

In 1999, a group of publishers, online booksellers, distributors, and data/content aggregators gathered around a table at the New York office of the Association of American Publishers. (It would be the first time representatives from both Amazon and Barnes & Noble were in the same room together, and the largest gathering to date of all participants in the book supply chain.) In discussions that occasionally got heated, the group acknowledged the challenges in getting book data from publisher to retailer (through a variety of channels) so that consumers could view it.

At the time, EDItEUR and AAP were developing something that they called Online Information Exchange – that meeting was the US book industry’s first exposure to what would become ONIX. Within a few months, the Book Industry Study Group had put together a committee (with liaisons from EDItEUR and AAP) to examine ONIX and determine the viability of its implementation in the US. This became the BISG Metadata Committee, which is now chaired by Richard Stark at Barnes & Noble (an original Muzer). ONIX is a global standard and the BISG Metadata Committee is the venue for American publishers to review the standard, recommend improvements, and troubleshoot implementation.

So what is ONIX, exactly? It’s an XML schema for communicating information about products in the book supply chain. There are several series of standardized tags, as well as codelists denoting controlled vocabularies. Much of the work of the BISG Metadata Committee centers around the codelists – defining formats, contributor roles, determining what the term Page Count actually means. Entire 3-hour meetings have been dedicated to defining what a Pub Date is. (This has never actually been resolved to anyone’s satisfaction.) Where there are standards, there are compromises, arguments, and rat holes. Thirteen years after the first meeting about ONIX, the discussions can still sometimes get quite heated.
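To make "tags plus codelists" a little more concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the record is invented (the ISBN, title, and name are placeholders), the tag names are ONIX 2.1-style reference names as I recall them, and the three small lookup tables are hand-copied excerpts rather than the real codelists. Check everything against the published specification before relying on it.

```python
import xml.etree.ElementTree as ET

# Invented ONIX 2.1-style product record. Every value is a placeholder.
SAMPLE = """
<Product>
  <RecordReference>example.com.0001</RecordReference>
  <NotificationType>03</NotificationType>
  <ProductIdentifier>
    <ProductIDType>15</ProductIDType>
    <IDValue>9780000000002</IDValue>
  </ProductIdentifier>
  <ProductForm>BB</ProductForm>
  <Title>
    <TitleType>01</TitleType>
    <TitleText>An Example Title</TitleText>
  </Title>
  <Contributor>
    <ContributorRole>A01</ContributorRole>
    <PersonName>Jane Example</PersonName>
  </Contributor>
</Product>
"""

# Hand-copied codelist excerpts (assumed values; verify against the
# current codelists published by EDItEUR).
PRODUCT_ID_TYPE = {"15": "ISBN-13"}
PRODUCT_FORM = {"BB": "Hardback", "BC": "Paperback"}
CONTRIBUTOR_ROLE = {"A01": "By (author)", "B01": "Edited by"}


def describe(product_xml):
    """Turn one coded <Product> record into human-readable fields."""
    product = ET.fromstring(product_xml.strip())
    id_code = product.findtext("ProductIdentifier/ProductIDType")
    form_code = product.findtext("ProductForm")
    contributors = [
        (
            CONTRIBUTOR_ROLE.get(c.findtext("ContributorRole"), "unknown role"),
            c.findtext("PersonName"),
        )
        for c in product.findall("Contributor")
    ]
    return {
        "id_type": PRODUCT_ID_TYPE.get(id_code, "unknown identifier"),
        "id": product.findtext("ProductIdentifier/IDValue"),
        "title": product.findtext("Title/TitleText"),
        "format": PRODUCT_FORM.get(form_code, "unknown format"),
        "contributors": contributors,
    }


record = describe(SAMPLE)
```

The point of the sketch is that the XML itself carries only codes like BB and A01. The meaning lives in the codelists, which is why so much committee time goes into arguing about them.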

In the process, the term ONIX has come to be nearly synonymous with “book metadata”. Many, many publishers never view the DTD or create XML files – the metadata gets entered in spreadsheets, or by hand in online data-entry forms. In 2005, BISG began developing a set of best practices for publishers who were sending metadata, regardless of format.

ONIX has undergone several revisions. Most of the US industry is still on Version 2.1 (European publishers are moving to Version 3.0, which handles ebook metadata and other issues with more flexibility).

A Bit Of History: Building Babel

In 1995, I was working for a weird little company called Muze. Originally located in Williamsburg, Brooklyn, well before Williamsburg became synonymous with “hipster”, Muze was founded in 1990 by Trev Huxley, grandson of Aldous, and Paul Zullo, producer of the King Biscuit Flower Hour. It was originally a database of music that had been converted to CD; later, they created a video database as well. By the time I got there, they had just licensed Bowker’s Books in Print database and were amplifying that with synopses, reviews, and all manner of links and tags. The goal of all this was to install the data in kiosks in stores, so customers could easily browse for the products they wanted.

The team that created all this content spent two days a week at the New York Public Library, researching connections between authors and books. We created Schools and Movements, Themes and Genres. We created sprawling taxonomies of time periods and locations. We mapped Bowker subject headings to BISAC categories, created a central Canon of authors whose works we would prioritize. We transcribed the endorsements on the back jackets of books. We entered flap copy into the database, wrote our own annotations when moved to do so. We sent stacks and stacks of faxes to Bowker, correcting their data.

And suddenly there was Amazon.

It was sudden, almost overnight. The book world was upended. Amazon had licensed data from Baker & Taylor, not us (nor Bowker), and while the Muze data was more intricate, the Amazon data was more visible. Many of us, over the next three years, went to work for Barnes & Noble.com to help B&N attempt to duplicate Amazon’s success – I can’t speak for the others on the team, but in my case it was a matter of finding work that was secure. Barnes & Noble would never go out of business, and if this whole World Wide Web thing didn’t pan out, I could at least work in the back office of the bookstore chain.

By 1998, it was becoming apparent that Amazon had caught the book industry – and, to an extent, even itself – flat-footed in one regard: information about books. Consumers could see it. Authors (and their mothers) complained. Publishers complained. Agents complained. The titles were truncated. The prices were wrong. The annotations – such as they were – were either far too brief to tell what the book was about, or filled with HTML-unfriendly characters. Data at online stores got updated erratically – depending on how many people were working that day, who was chosen for Oprah’s Book Club, what other promotions were being held. Publishers clearly had never expected anyone outside the book industry to look into their databases…and it showed. Rather than flipping through a beautifully-illustrated, curated Spring or Fall catalog, consumers were confronted with what sometimes looked like gibberish.

It became clear that internet bookstores were not going away, and the industry needed a standard for the information.


ISBN pricing at Bowker

This topic just arose on Twitter and I thought it would be useful to address two questions: (1) the cost of a single ISBN vs packages, and (2) the cost of ISBNs in the US vs overseas.

First, why a single ISBN costs $125, while packages of 10 or more are cheaper. Bowker doesn’t encourage the purchase of a single ISBN, and that’s reflected in the pricing. The truth is, most books are published in multiple formats these days – ebook, hardcover, paperback. And each one needs to be separately identified in the book supply chain – which requires each separately tradable product to be assigned an ISBN. So anyone purchasing a single ISBN is more inclined to re-use it for another product or format of a product – which really causes headaches for the supply chain, as it overwrites historical data and confuses retailers. We price packages of ISBNs at a lower per-ISBN price because we want to actively discourage the confusion that results when someone re-uses an ISBN. However, we also recognize that occasionally, a single ISBN purchase is warranted, so we do allow for it.
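Here is a toy illustration of why re-use hurts. It is a sketch, not how any real retailer or Bowker system works: the catalog is a hypothetical in-memory dictionary keyed by ISBN, the `register` helper is made up for this example, and the ISBNs and title are placeholders.

```python
def register(catalog, isbn, title, product_format):
    """Store a record under its ISBN and return whatever it replaced."""
    previous = catalog.get(isbn)
    catalog[isbn] = {"title": title, "format": product_format}
    return previous


# One ISBN re-used for a second format: the hardcover record is overwritten.
reused = {}
register(reused, "9780000000002", "An Example Title", "hardcover")
lost = register(reused, "9780000000002", "An Example Title", "ebook")

# One ISBN per format: both records survive side by side.
separate = {}
register(separate, "9780000000002", "An Example Title", "hardcover")
register(separate, "9780000000019", "An Example Title", "ebook")
```

In the first catalog only the ebook remains and the hardcover's history is gone. In the second, each tradable product keeps its own record.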

It’s interesting to note that in the UK, purchase of a single ISBN is not even possible. Initial packages start with 10 ISBNs.

Second, why the cost of ISBNs in the US is higher than it is in other countries. Outside the US (and the UK, and a few other countries), ISBNs are issued by organizations with governmental support – national libraries, for example. In the US, ISBNs are issued by a private company (Bowker/ProQuest, where I am an employee); in the UK, ISBNs are issued by Nielsen. So in some countries, ISBNs are essentially subsidized by taxes. Unfortunately that is not the case in the US. You can find the appropriate person to email about that here.

I hope this helps explain things.


Beams Not Falling

“Flitcraft adjusted himself to beams falling, and then no more of them fell, and he adjusted himself to them not falling.”

The story of Flitcraft is probably the strangest interlude in The Maltese Falcon (a book that has some very strange interludes, to be sure). It’s a tale within a tale – Sam Spade has a moment alone with Brigid O’Shaughnessy and he tells her what amounts to a parable about an old case of his.

Essentially, Flitcraft was a very ordinary man living a very ordinary life, who one day just up and disappeared, “like a fist when you open your hand”. Flitcraft’s wife hired Sam to track him down. When he finally found Flitcraft, he found a man who was…a very ordinary man living a very ordinary life – a virtual duplicate of the life he had left behind. Sam asked him, “Why did you do that? Why did you leave everything you have, and then recreate it somewhere else?”

And Flitcraft told Sam about how he was walking past a construction site the morning of his disappearance. A girder beam fell from the site and nearly killed him – missed him by an inch. This close brush with death terrified him, and he bolted. Gradually, it dawned on Flitcraft that this was an isolated, bizarre incident and he resumed his ordinary life. By fleeing, he adjusted himself to the falling of beams, to the fact that life very well could end at any moment. But beams can’t fall forever from the sky – we are not in turmoil all the time. When he’d reached emotional equilibrium, he found himself adjusting to the beams not falling. To the everyday.

We generally return to a state of beams not falling. And as we do this, we are (like Flitcraft) in a different place than we were initially. The falling of beams causes movement, causes adjustment – and then things begin to seek their own levels.

I’m thinking about ebook pricing.

Hitting the Books

So – in addition to “Book: A Futurist’s Manifesto”, I’ve got some contributions about metadata and identifiers in the Frankfurt Book Fair ebook. Apparently identifiers will achieve near ubiquity at Frankfurt – if you see a piece in the show daily about ISNIs and ISTCs, there’s a 50% chance it will be mine (Olav Stokkmo, of IFRRO, also has a piece on the two identifiers). And Publishing Perspectives will be running a piece about ISNI as well.

It’s all ISNI all the time around these parts. There’s some exciting stuff coming out of Wikipedia and Google in the coming months. I’ve got some delicious pilots set up for 2013. All that vagueness of previous posts is coalescing into some concrete action.

Essentially, it can be boiled down thus: Identifiers are like a neon sign to search engines. They point out that certain information is authoritative, and also unique. Search engines look for content that is authoritative and unique, and rank that content higher in search results.

Here’s why identifiers designate authoritative and unique things: People have to pay for them. If you care enough about your content to pay a registration fee and enter all the metadata that the registration agency wants, you obviously have something worth looking at. People just don’t buy identifiers and fill out online forms for the hell of it. Search algorithms know this. They prioritize identified content. If your content is cared-for and tended-to, that will get recognized.



Tights! I love the first day of tights in fall – I think I rushed it a little, as it’s supposed to hit 77 today, but I don’t care. I’m ready.

We spent the weekend in the final throes of summer – a pool party with several families, one of which contained three little girls (ages 4, 5, and 6) who roamed about the yard and house in a small, silent pink herd – quietly rearranging the lawn furniture, surreptitiously feeding the dog banana yogurt, making pilgrimages to the car in which they arrived to fetch backpacks and baby dolls and a large broken umbrella. They left as quietly and pink-ly as they arrived.

Scamp and I prepared for fall knitting – cleaning out our yarn baskets and revisiting half-finished projects. Bernardo fertilized the lawn. We watched parts of a couple of football games – and found ourselves re-watching Remember the Titans. Bernardo roasted some chicken with butternut squash (marjoram, garlic – very savory against the sweet backdrop of the squash). There were apples.

We’re moving inexorably towards the closing of the swimming pool. It will be too chilly to eat outside. Gradually and then suddenly we’ll move indoors – at first feeling grateful and energetic, later feeling cooped-up and frustrated. It will snow. There will be needles from the Christmas tree all over the floor. We will fantasize about golf.

But for now, we’re bustling. The Rhinebeck Sheep and Wool Festival awaits.

Oh, Happy Day!

Today I got my copy of this: Book: A Futurist’s Manifesto. What a great project this was – I absolutely loved being included in it. And it really IS a manifesto, which I love even more – those of us battling the same-old-same-old NEED a manifesto.

Another thing that makes me happy is the workshop that Brian O’Leary and I are giving in Frankfurt. Yes, it’s about metadata – it’s a deep dive into what works and what doesn’t, how to refine even the best practices to describe books more effectively, and…(my favorite) what’s next. Semantic web! Identifiers! Linked data! XML! Context and containers! The space between things! Nodes! Relationships! How the web works!

Brian has a great description on his blog about it. I think it’s going to be pretty groundbreaking. Kinda…futurist.


In Paris

I am in Paris for an ISNI Board meeting. Except it’s not quite Paris – it’s just outside in Place de la Defense. I am at the Hilton (cue Paris Hilton jokes), directly across from two shopping malls.

This area is dedicated to commerce. It’s all glass and concrete. This is the view from my hotel room window.


And here are some shots of the neighborhood:






The point of pride here is this square arch:


Around the bend is a sculpture of a ginormous thumb, which I can’t help but feel is a monument to swiping your iPhone:


