We hear the term “unstructured data” often. It’s brought up as the enormous challenge of big data and is often cited as the reason why traditional relational databases don’t meet the needs of big data. But that conversation doesn’t adequately describe the challenge organizations face with unstructured data.
To get your head around unstructured data, you have to consider the history of data itself. When we started digitizing our world in the 20th century, we first went after the low-hanging fruit of transactional data…accounting. It was a quick win to transfer spreadsheets of information in neat columns and rows.
Decades later, we’re digitizing everything in sight and sharing it across the enterprise, with our partners and with our personal connections. Despite everything that we’ve accomplished, there is still an enormous amount of enterprise information that sits in text documents and presentations, graphics, email, audio, video, web pages and files from various office software. Keep this in mind…it isn’t that unstructured data lacks any structure…it’s that unstructured data doesn’t fit the enterprise relational data model.
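To make that last point concrete, here is a minimal sketch in Python using the standard library's `email` module. The sample message and the `email_to_row` helper are invented for illustration. The headers map neatly onto columns, but the body, where the business meaning lives, is still free text that no fixed schema describes.

```python
from email import message_from_string

# Hypothetical sample message, invented for illustration.
RAW_EMAIL = """\
From: alice@example.com
To: bob@example.com
Subject: Q3 supplier contract renewal

Bob, the supplier wants a 4% price increase starting in October.
Can we discuss before Friday?
"""

def email_to_row(raw):
    """Split an email into fixed metadata fields plus a free-text body."""
    msg = message_from_string(raw)
    return {
        "sender": msg["From"],        # fits a relational column
        "recipient": msg["To"],       # fits a relational column
        "subject": msg["Subject"],    # fits a relational column
        "body": msg.get_payload(),    # free text: no column captures its meaning
    }

row = email_to_row(RAW_EMAIL)
print(row["sender"])
print(row["subject"])
print(len(row["body"].split()), "words of free text left in the body")
```

The price increase and the Friday deadline are exactly the facts a decision maker would want, and neither one lands in a queryable field.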
Even worse, much of our enterprise process exists as unstructured data, living in the heads of workers and lacking any systematic approach for capture, management, communication, measurement and improvement. When the work activities themselves are unstructured, the day-to-day behavior of workers lacks cohesiveness and efficiency. But I digress. Let’s get back to data itself.
Why haven’t we fixed this?
What keeps us from successfully managing unstructured data? A few things:
- A lack of tools that easily manage unstructured data. Tools need to provide efficient text parsing and analytics, taxonomy and metadata management.
- Difficulty integrating unstructured data with existing information systems. The two are often seen as apples and oranges when it comes to analytics and decision making.
- A shortage of skills among existing staff.
- A missing sense of urgency for managing unstructured data.
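As a rough illustration of what the first item means by text parsing and taxonomy management, here is a deliberately simple Python sketch that tags a document with categories based on keyword matches. The `TAXONOMY` mapping, its keywords and the `tag_document` helper are all hypothetical. Real tools go much further, handling synonyms, context and scale, but the basic idea of attaching metadata to free text is the same.

```python
import re

# Hypothetical taxonomy: category -> keywords. Invented for illustration.
TAXONOMY = {
    "finance": {"invoice", "payment", "budget"},
    "legal": {"contract", "liability", "compliance"},
}

def tag_document(text, taxonomy=TAXONOMY):
    """Return the sorted taxonomy categories whose keywords appear in the text."""
    # Lowercase the text and split it into alphabetic words.
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Keep every category that shares at least one keyword with the text.
    return sorted(cat for cat, keywords in taxonomy.items() if words & keywords)

doc = "Please review the attached contract before we approve the payment."
print(tag_document(doc))  # ['finance', 'legal']
```

Once documents carry tags like these, they can sit alongside structured records in reports and queries, which is one small step toward the integration problem in the second item.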
Despite our best efforts to corral the unstructured beast, this kind of data continues to grow and presents a real problem for organizations that want to automate and improve their ability to understand their business, anticipate what’s coming and act quickly on risk and opportunity. There are certainly tools that are maturing and providing the beginnings of a solution. The challenge, however, will be in finding the urgency and getting our organizations to see the value of getting data out of its various hiding places and into a place where it can be used and valued.