5 Best Practices for Optimizing Your Unstructured Data

Krishna Subramanian Co-founder, President and COO, Komprise
Photo by Reid Naaykens on Unsplash

When it comes to unstructured data, businesses face something of a paradox. On the one hand, they have more unstructured data—and, by extension, more potential for gaining critical insights from data—than ever before. On the other hand, IT leaders report a variety of challenges in putting all of their unstructured data to good use. Problems like figuring out how to move unstructured data without disrupting users, poor visibility into unstructured data, and legal constraints are all common barriers to optimizing the management of unstructured data, according to a recent Komprise survey.

To overcome those hurdles, IT organizations need ways of deriving value from unstructured data while simultaneously addressing priorities like securing the data, reining in data storage costs, and future-proofing data against the business needs of tomorrow. It's possible to square this circle, but only with the right approach to unstructured-data management.

To provide actionable guidance, this article walks through five best practices for maximizing the value of unstructured data—meaning any type of data that doesn't originate in a database, spreadsheet, or other structured data format. As you'll learn, no matter how much unstructured data your business has to work with, it's possible to turn that data into value while also meeting security, cost-management, and flexibility requirements.

1. Don't Fly Blind with Unstructured Data

Effective management of unstructured data starts with knowing your data and understanding the core metrics surrounding your data. To get started, you'll need to establish visibility into such things as:

  • How much data you have,
  • How old your data is,
  • Where the data is stored,
  • What types of information the data consists of,
  • What the file types and file sizes of your data are,
  • Who owns the data,
  • Who can access the data,
  • What the access patterns look like, and
  • What it costs to store the data.

This visibility is critical because, in most cases, unstructured data is born inside silos. Each department in your business likely stores its own sets of documents, video, audio, application data (for instance, genomics, medical images, or self-driving car data), reports, and so on. In many cases, that data may not even be centralized within departments—let alone across the business as a whole. And unless you know which unstructured data you have, you can't make informed decisions about how best to manage it.
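As a rough illustration of what establishing visibility can look like, the sketch below walks a directory tree and summarizes a few of the metrics listed above: how much data there is, how old it is, and which file types take up the space. It is a minimal sketch, not a description of any particular product. The function name `summarize_tree`, the age buckets (90 days and 1 year), and the use of modification time as a proxy for data age are all assumptions made for this example.

```python
import os
import time
from collections import Counter
from pathlib import Path


def summarize_tree(root, now=None):
    """Walk `root` and collect basic visibility metrics for the files under it.

    Hypothetical helper for illustration; the age thresholds are assumptions.
    """
    now = time.time() if now is None else now
    summary = {
        "file_count": 0,
        "total_bytes": 0,
        "bytes_by_extension": Counter(),   # file type -> bytes stored
        "files_by_age_bucket": Counter(),  # age bucket -> number of files
    }
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # file vanished or is unreadable; skip it

            # Modification time stands in for "how old the data is".
            age_days = (now - info.st_mtime) / 86400
            if age_days < 90:
                bucket = "under_90_days"
            elif age_days < 365:
                bucket = "90_days_to_1_year"
            else:
                bucket = "over_1_year"

            extension = path.suffix.lower() or "(none)"
            summary["file_count"] += 1
            summary["total_bytes"] += info.st_size
            summary["bytes_by_extension"][extension] += info.st_size
            summary["files_by_age_bucket"][bucket] += 1
    return summary
```

Running `summarize_tree` against each department's share gives a first, comparable view across silos. Ownership, access rights, and storage cost would need additional sources (for example, file-system ACLs and billing data) that this sketch does not cover.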

2. Plan for Ongoing Data Mobility

Businesses tend to treat data migration as an infrequent, one-off event. When they plan to migrate data from on-premises storage into the cloud, for example, they might assume that the migration ends once the data has been moved.

The reality of data lifecycles is more complex. In many cases, unstructured data is constantly in motion. After you move it to the cloud, you're likely to move it to different storage tiers within the cloud, or from one type of cloud service (like object storage) to another (like a data-analytics platform).

For this reason, IT leaders need a systematic way to manage data movement on a continuous basis. They should treat cloud data migration as an ongoing process, and they should support it with policy-based automation wherever possible. That's the only way to ensure that data is always living in the right place as it moves through the lifecycle from active use to cold storage or archives—and then, sometimes, back again to active use.
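To make "policy-based automation" more concrete, here is a minimal sketch of a tiering policy that can be evaluated repeatedly rather than once. The tier names (`hot`, `cold`, `archive`), the idle-time thresholds, and the names `TierRule`, `choose_tier`, and `plan_moves` are all hypothetical, chosen only for this example. A real policy would likely also consider file type, owner, and cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRule:
    min_idle_days: int  # rule applies once data has been idle at least this long
    tier: str           # storage tier the data should live in


# Hypothetical policy: thresholds and tier names are assumptions.
POLICY = [
    TierRule(365, "archive"),
    TierRule(90, "cold"),
    TierRule(0, "hot"),
]


def choose_tier(idle_days, policy=POLICY):
    """Return the tier of the strictest rule that `idle_days` satisfies."""
    for rule in sorted(policy, key=lambda r: r.min_idle_days, reverse=True):
        if idle_days >= rule.min_idle_days:
            return rule.tier
    raise ValueError("policy has no rule covering this idle time")


def plan_moves(files, policy=POLICY):
    """Compare each file's current tier with the policy's target tier.

    `files` is an iterable of (name, current_tier, idle_days) tuples.
    Returns a list of (name, from_tier, to_tier) for files that should move.
    """
    moves = []
    for name, current_tier, idle_days in files:
        target = choose_tier(idle_days, policy)
        if target != current_tier:
            moves.append((name, current_tier, target))
    return moves
```

Because `plan_moves` only compares current placement with the policy, it can run on a schedule. A file that has gone idle is planned for a colder tier, and an archived file that is in use again is planned for a move back to `hot`.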

3. Continuously Add Value to Unstructured Data

IT leaders are already thinking to a certain extent about how to add value to unstructured data. For instance, according to Komprise's survey, 65 percent of organizations seek to deliver unstructured data to big-data analytics platforms in order to derive value from it.

That's one way to add value to unstructured data. But smart IT leaders are thinking more comprehensively about getting the most out of their data. They're indexing unstructured data as part of their data-migration and consolidation processes, so that the data becomes easier to find, search, and use. And they're using the cloud not just as low-cost storage, but as a place to build a data lake where cloud compute services can drive analytics on their data.

The point here is that IT leaders should be constantly on the lookout for ways to make unstructured data easier for everyone within the business to use. Big-data analytics are part of that equation, but they're certainly not the only component.

4. Enable Secure Self-Service for Your Data

Along similar lines, empowering business users with self-service access to unstructured data should be a priority for IT leaders.

The reason is that neither moving data into the cloud nor creating a data lake is enough on its own to guarantee that data has real business value. To achieve that, users must be able to find the data easily and integrate it into their workflows using seamless self-service processes.

Systematically tagging unstructured data is the key to enabling self-service. When data is well-labeled, users across the business can easily search for and find the documents, photos, videos, and other types of information they need—no matter how many data assets the business owns, and no matter its organizational structure. Of course, the search and access mechanisms need to enforce security and access control so that each user only sees the data they are authorized to access.
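The combination of tagging and access control can be sketched as a small in-memory catalog. This is an illustrative sketch only: the class name `TaggedCatalog`, the group-based permission model, and the sample asset names are assumptions, and a production system would rely on a real search index and the organization's identity provider.

```python
from collections import defaultdict


class TaggedCatalog:
    """Hypothetical tag index whose search results respect read permissions."""

    def __init__(self):
        self._assets_by_tag = defaultdict(set)  # tag -> asset ids
        self._readers = {}                      # asset id -> groups allowed to read

    def add(self, asset_id, tags, allowed_groups):
        """Register an asset with its tags and the groups that may read it."""
        self._readers[asset_id] = set(allowed_groups)
        for tag in tags:
            self._assets_by_tag[tag.lower()].add(asset_id)

    def search(self, tag, user_groups):
        """Return assets carrying `tag` that at least one of the user's groups may read."""
        matches = self._assets_by_tag.get(tag.lower(), set())
        groups = set(user_groups)
        return sorted(a for a in matches if self._readers[a] & groups)


catalog = TaggedCatalog()
catalog.add("scan-001.dcm", ["radiology", "2021"], ["clinical"])
catalog.add("report.pdf", ["radiology"], ["clinical", "finance"])
```

With this setup, a user in the `finance` group who searches for `radiology` gets only `report.pdf`, while a user in `clinical` sees both assets. The filtering happens inside `search`, so unauthorized assets never appear in results.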

5. Embrace Standards-Based Data Management

Your data is yours. Don't let vendors control where you can store it and what you can do with it.

Instead, choose unstructured-data management tools that are standards-based. This ensures that you can move data across any platform or use any type of data service that is also standards-based, without depending on a specific vendor to enable that functionality.

Standards-based management of unstructured data is particularly important given that the world is constantly evolving. Even if you're happy with the data platforms and tools you use today, you may not be happy tomorrow. Standards-based tooling ensures that you never end up stuck because you can't migrate data.

On top of this, standards-based tools help ensure that businesses can do whatever they need to do with their data without paying licensing penalties or other avoidable costs, such as fees for a third-party cloud filesystem or unnecessary cloud-egress charges. By using data-management solutions that store data in its native format in each tier, you can access the data directly and use any cloud data service on it without having to pay a data-management or storage vendor for access. Avoiding these costs is a priority for 42 percent of IT leaders surveyed by Komprise.

Embracing the Unstructured-Data Challenge

The amount of unstructured data that businesses have to manage will grow steadily for the foreseeable future. According to Komprise's survey, 87 percent of IT leaders want to manage unstructured data effectively as it continues to grow. Rather than viewing unstructured data as a liability or a challenge, IT leaders should look for ways to derive more value from it.

Doing so starts with understanding the data and implementing automated data management. From there, businesses can build data enrichment into their management processes, offer self-service access to data, and embrace standards-based operations to get the most out of the unstructured data they generate.
