
Moving and storing unstructured data requires highly scalable and performant data storage
One of the biggest challenges facing most enterprise IT operations today is managing the tsunami of unstructured data generated by an increasing number of applications and devices. Unstructured data is digital information that does not exist within a formalized data model (such as content tagged with a markup language) or organizational structure (such as a database). Examples of unstructured data include raw textual data, sensor data, video content, and audio files. Chances are you’ve worked with unstructured data at some point today.
Unstructured data is now everywhere within the modern enterprise. Analysts estimate that unstructured data makes up roughly 80% of all managed data. Near and dear to OpenDrives is the media, entertainment, and broadcast world, where full-feature-length movies are sometimes shot in 8K resolution and live TV and sports programming is often captured with 4K video cameras. As a storage vendor, we’re always trying to find ways to optimize workflows that rely on such large-scale unstructured information.
But don’t just take our word for it. Here’s a thought along these lines from our partner Signiant:
“Futuresource estimates that 8K adoption will grow to form a significant part of the premium TV market by 2023. As fringe as it seems, we are still witnessing more high-end productions in 8K. Japan’s NHK launch the world’s first 8K TV channel in December of 2018. Some panel makers, especially in Asia, are already adding 8K resolution to their future roadmaps and product plans, with the first consumer models likely to appear in the next couple of years.”
Significant advances in compression technologies are counteracting some of the file size increases in unstructured data, but the sheer size of these datasets is still staggering. The example file sizes below, for raw and production-codec footage, show how steeply one hour of video grows across SD, HD, UHD, and 8K resolution:
Format | Size of one hour of video |
SD | 86 GB |
1080P 60 ProRes | 224 GB |
4K RAW BMPC 24 | 742 GB |
8K RedCode RAW 75 | 7.29 TB |
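To put the table's numbers in perspective, here is a minimal Python sketch that computes how many times larger each format is than SD and the average data rate needed to move one hour of footage in real time. The helper names are ours, and the sketch assumes decimal units (7.29 TB = 7,290 GB), which the source does not specify.

```python
# Per-hour sizes copied from the table above, in decimal gigabytes.
# Assumption: 1 TB = 1000 GB, so 7.29 TB is entered as 7290 GB.
SIZES_GB = {
    "SD": 86,
    "1080P 60 ProRes": 224,
    "4K RAW BMPC 24": 742,
    "8K RedCode RAW 75": 7290,
}

SECONDS_PER_HOUR = 3600


def growth_vs_baseline(sizes, baseline="SD"):
    """Return each format's size as a multiple of the baseline format."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


def sustained_rate_mb_per_s(size_gb):
    """Average rate (decimal MB/s) needed to move one hour of footage in one hour."""
    return size_gb * 1000 / SECONDS_PER_HOUR


ratios = growth_vs_baseline(SIZES_GB)
for name, size in SIZES_GB.items():
    rate = sustained_rate_mb_per_s(size)
    print(f"{name:<20} {size:>5} GB  {ratios[name]:5.1f}x SD  {rate:7.1f} MB/s")
```

Under these assumptions, the 8K figure is roughly 85 times the SD figure and works out to about 2 GB per second sustained, before any overhead from concurrent users or streams.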
Beyond the M&E markets, myriad other use cases generate enormous amounts of data with huge files and datasets. These include genomic sequencing (~750 MB), digital pathology (which varies widely, from 15 GB per slide to over 3 TB), audio, traditional medical imaging modalities, the growing number of higher-definition digital video surveillance and drone cameras deployed around the world, and the millions of IoT sensors now used everywhere from manufacturing to home appliances to autonomous vehicles.
Using and analyzing this data requires far more compute power (to render, transcode, compress, or feed AI/ML compute engines), greater network bandwidth to transfer it, and much larger storage system capabilities to retain, move, and back up these large files. Much of this data is also stored in the cloud, sometimes only temporarily, to take advantage of elastic compute resources that analyze it and create metadata with significant additional value to the line of business.
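As a rough illustration of the bandwidth point, the sketch below estimates how long it takes to move one hour of 8K footage (the 7.29 TB figure from the table earlier) over links of different speeds. The `transfer_hours` helper, the link speeds, and the 80% efficiency factor are all illustrative assumptions on our part, not measurements.

```python
def transfer_hours(size_tb, link_gbps, efficiency=0.8):
    """Hours needed to move size_tb (decimal terabytes) over a network link.

    link_gbps is the nominal line rate in gigabits per second. efficiency is
    an assumed fraction of that rate actually achieved (hypothetical value).
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    size_bits = size_tb * 1e12 * 8                 # terabytes -> bits
    usable_bits_per_s = link_gbps * 1e9 * efficiency
    return size_bits / usable_bits_per_s / 3600    # seconds -> hours


# One hour of 8K footage over three assumed link speeds.
for gbps in (1, 10, 100):
    print(f"{gbps:>3} Gb/s: {transfer_hours(7.29, gbps):6.2f} hours")
```

With these assumptions, a 1 Gb/s link needs about 20 hours to move a single hour of 8K footage, while a 10 Gb/s link needs about 2 hours, which is why network capacity has to scale alongside storage.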
OpenDrives is a data management software company at its core, with deep roots in the M&E industry. As such, we’ve invested our resources over the last decade in bringing to market high-performance network-attached storage (NAS) solutions that can handle the live processing of these extremely large files and resource-intensive workflows within an open filesystem. Our Atlas Core software is a 128-bit filesystem (based on ZFS with extensive R&D optimizations) that supports file sizes up to 16 exabytes, directories of up to 256 trillion files, and a virtually unlimited number of objects and capacity within a single filesystem (the real limit is 256 quadrillion zettabytes). We have yet to see the workflow or unstructured dataset we can’t efficiently support while retaining our highly performant characteristics.
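The limits quoted above line up with powers of two, which is easy to check. The sketch below assumes they correspond to a 64-bit file size, a 48-bit directory entry count, and a 128-bit pool address space, with "exabytes", "trillion", and "quadrillion" read as binary multiples. That mapping is our reading of the figures and of commonly cited ZFS limits, not something the text states.

```python
# Binary unit sizes in bytes.
EIB = 2**60  # exbibyte
ZIB = 2**70  # zebibyte

# Assumed bit widths behind the quoted limits (our interpretation).
LIMITS = {
    "max_file_bytes": 2**64,    # 64-bit file size
    "max_dir_entries": 2**48,   # 48-bit directory entry count
    "max_pool_bytes": 2**128,   # 128-bit pool address space
}


def quoted_figures(limits):
    """Express each limit in the units used in the paragraph above."""
    return {
        "file_size_eib": limits["max_file_bytes"] // EIB,
        "dir_entries_binary_trillions": limits["max_dir_entries"] // 2**40,
        "pool_binary_quadrillion_zib": limits["max_pool_bytes"] // ZIB // 2**50,
    }


figures = quoted_figures(LIMITS)
print(figures["file_size_eib"])                 # 16
print(figures["dir_entries_binary_trillions"])  # 256
print(figures["pool_binary_quadrillion_zib"])   # 256
print(f"{LIMITS['max_dir_entries']:,}")         # 281,474,976,710,656
```

Note that in decimal terms 2^48 is about 281 trillion entries, so the "256 trillion" figure only matches exactly when "trillion" is read as 2^40.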
Object-based storage solutions are also extremely popular with customers today because they can store billions of files, each with a unique identifier and associated metadata bundled with the object itself. Many reading this blog will be very familiar with the capabilities and prevalence of AWS’s S3 object store standard.
OpenDrives today supports standard network protocols (NFS and SMB) plus the S3 standard, which means we can mount S3 buckets and present as an S3 target with no problem. These features give our customers the flexibility to store and access their data via whatever protocol meets their specific data management and operational requirements. Plus, you can easily store unstructured data in a hybrid architecture combining on-premises and cloud storage targets: some data needs to be on-premises, while other data is better suited to off-premises storage in a private or public cloud service, depending on your technical requirements and, of course, budget constraints.
In the end, it’s all about serving up the right data at the right place at the right time, at the level of performance your end users need to generate business value each and every day. We would love the opportunity to discuss your unstructured data needs and how an OpenDrives storage solution can provide the ideal infrastructure for this challenging data type. Just drop us a quick message!