I talk a lot about identifiers. It’s my job. The esoteric identifiers – DOIs, ISNIs, ISTCs. The pragmatic ones – ISBNs. The other day I found myself in a meeting referring to URIs while the developers were talking about URLs (this is how you know you are either a geek or a purist jerk, or both – yeah, for 15 minutes I was “that guy”).
But outside of work, there are plenty of identifiers in our everyday lives – with varying degrees of “smartness” and “dumbness”. We’re quite comfortable with these, because we’ve grown up with them, and have to use them all the time, but when it comes to Big Data, they’re no different than any of the other numbers we talk about.
Social Security numbers are a good start. The first three digits (the “area number”) indicate the state where the SSN was assigned. The next two digits are called the “group number” – they group together the last four digits, which are issued sequentially. However! Some states were running out of numbers. So in 2011, the Social Security Administration began randomizing the assignment of numbers, and the prefix of a newly issued SSN no longer says anything about where it came from.
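To make that three-part layout concrete, here is a minimal Python sketch that splits an SSN into its area, group, and serial pieces. The names `SSNParts` and `split_ssn` are made up for illustration, and `123-45-6789` is a placeholder, not a real number. The code only slices digits; it makes no attempt to decide whether a number was ever actually issued.

```python
import re
from typing import NamedTuple


class SSNParts(NamedTuple):
    """The three positional fields of a nine-digit SSN (hypothetical helper type)."""
    area: str    # first three digits: historically tied to a state
    group: str   # next two digits: the "group number"
    serial: str  # last four digits: the serial number within the group


def split_ssn(ssn: str) -> SSNParts:
    """Split an SSN string into its positional parts.

    Accepts "123-45-6789" or "123456789"; any non-digit characters are ignored.
    """
    digits = re.sub(r"\D", "", ssn)
    if len(digits) != 9:
        raise ValueError(f"expected 9 digits, got {len(digits)}")
    return SSNParts(area=digits[:3], group=digits[3:5], serial=digits[5:])


if __name__ == "__main__":
    parts = split_ssn("123-45-6789")  # placeholder value, not a real SSN
    print(parts.area, parts.group, parts.serial)
```

For numbers assigned after the 2011 change, the `area` field is still three digits, but it should not be read as a state.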
Phone numbers are another example of this. The first three digits are the area code. The next three are the “exchange” – the local area the number belongs to. (Long ago, telephone exchanges were actually letters the caller would tell the operator, such as BUtterfield 8.) The last four digits are randomly generated within the parameters of first the exchange and then the area code. However! Three phenomena have disrupted this system entirely. The first is the rise of phone banks – the sheer number of telephone numbers that needed to be assigned to these banks meant that new area codes had to be made up. The second is (or, rather, was) the fax machine. Having to assign a separate phone line to each fax machine also ate up phone numbers. The third, of course, is cell phones. This caused the greatest disruption of all – over time, people wanted to keep their phone numbers regardless of where they lived. (My phone has an area code of 917, which used to mean Manhattan; it was assigned in 1997 when I lived in Brooklyn and worked in Manhattan – sixteen years later, I still have the same number even though I live on Staten Island and work in New Jersey.) Now phone numbers are essentially meaningless.
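The same positional idea can be sketched for a ten-digit US phone number. This is only an illustration: `PhoneParts` and `split_phone` are hypothetical names, and `917-555-0123` is a made-up number. As with my own 917 number, the area code that comes back says nothing reliable about where the phone actually is.

```python
import re
from typing import NamedTuple


class PhoneParts(NamedTuple):
    """The three positional fields of a ten-digit US phone number (hypothetical type)."""
    area_code: str  # first three digits
    exchange: str   # next three digits
    line: str       # last four digits


def split_phone(number: str) -> PhoneParts:
    """Split a US phone number into area code, exchange, and line number.

    Punctuation and spaces are ignored, and a leading country code "1" is dropped.
    """
    digits = re.sub(r"\D", "", number)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {len(digits)}")
    return PhoneParts(area_code=digits[:3], exchange=digits[3:6], line=digits[6:])


if __name__ == "__main__":
    parts = split_phone("(917) 555-0123")  # made-up example number
    print(parts.area_code, parts.exchange, parts.line)
```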
There are plenty of others – driver’s license numbers, passport numbers, license plates, EZ-Pass numbers, bar codes, numbers on shipping containers, Apple UUIDs. And with the Internet of Things, there will only be more. As they proliferate, and as our circumstances change, the prefixes of these numbers will have less and less meaning inherent in them. Which is not a bad thing – identifiers are best when they are dumb. All they mean to say, of course, is “this thing is not that thing”.