Conquering Big Data
BY JACKIE SCHMOLL
In the drive for actionable intelligence, the sheer quantity of data can be overwhelming. We know that valuable insights are hidden inside those massive data sets. How can analysts get to them more quickly?
Every minute, sensors of all types are collecting data to support national security missions. Unprecedented quantities of signals intelligence, measurement and signature intelligence, geospatial intelligence, imagery intelligence, and open source data are pouring into ground systems for processing and mission incorporation.
Estimates reveal that today’s imagery analysts spend 80 percent of their time searching for the right data to answer critical questions. This leaves only 20 percent of their time to do actual analysis. Adding more staff does not have to be the solution. As a premier provider of content management tools and custom data solutions, Harris suggests four valuable tools that will make managing the big data influx easier, assure data quality, and enable analysts to get to real intelligence more efficiently and effectively.
Tool #1: Content Curation
Picture the museum curator dusting his prized artifact. He knows its value and proudly displays it for others to enjoy and learn from. Like the artifact, data has value and history. When we know where it came from and how it got to us, we can assign a level of confidence to it. Just as that prized artifact at the museum can be replaced with something even more spectacular, data can be replaced with something more current or useful.
Harris is modernizing data management with curation tools to monitor data’s value as it passes through time. Through curated data labeling, analysts can understand data’s origin and know what has been done to standardize it for insertion into big data storage systems. When analysts know the data’s origin and timeliness, applying it to missions carries less risk, and they can deliver better, actionable intelligence to the warfighter faster.
Tool #2: Data Virtualization
Virtual data systems connect and integrate disparate data silos without changing the original location of the data, reducing costs and risks associated with moving data from system to system. Data from the individual siloed systems is indexed, enabling the various dissimilar data sources to be accessed from a single point. This not only provides a convenient “window” into the possible data opportunities, but also ensures that the best, validated information is available and accessible to all users. Organizations gain a complete, more accurate picture of operations faster.
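The indexing idea described above can be sketched in a few lines of Python. This is only an illustrative toy, not a description of any Harris product: the `VirtualCatalog` and `IndexEntry` names, the silo names, and the sample records are all hypothetical. The point it shows is that the catalog stores lightweight pointers to records, while the records themselves stay in their original systems and are read only at query time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """A pointer to a record; the record itself is never copied."""
    silo: str             # which source system holds the record
    record_id: str        # the record's key inside that system
    keywords: frozenset   # searchable words taken from its description


class VirtualCatalog:
    """Hypothetical single access point over several independent silos."""

    def __init__(self):
        self._silos = {}   # silo name -> that silo's own record store
        self._index = []   # index entries only, no record contents

    def register(self, name, records):
        # Keep a reference to the silo and index its records in place.
        self._silos[name] = records
        for record_id, record in records.items():
            words = frozenset(record["description"].lower().split())
            self._index.append(IndexEntry(name, record_id, words))

    def search(self, term):
        # One query spans every registered silo.
        term = term.lower()
        return [(e.silo, e.record_id) for e in self._index if term in e.keywords]

    def fetch(self, silo, record_id):
        # Read from the original location at query time.
        return self._silos[silo][record_id]


# Two made-up silos standing in for separate source systems.
imagery = {"img-001": {"description": "Harbor overview at dawn"}}
reports = {"rpt-17": {"description": "Vessel traffic near the harbor"}}

catalog = VirtualCatalog()
catalog.register("imagery", imagery)
catalog.register("reports", reports)
hits = catalog.search("harbor")
```

Here `hits` lists matching records from both silos, and `catalog.fetch(...)` returns the record object held by the source system rather than a copy. A production system would add access control, richer indexing, and network connectors, none of which this sketch attempts.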
Whether systems are located on premises or in the cloud, virtual databases are proven to successfully bridge the gaps between multiple data systems. Harris is implementing secure virtual data systems as the way of the future to efficiently meet mission needs without extensive on-premises equipment.
Tool #3: Quality Assessment, Conflation, and Labeling
To make effective use of data in big data environments, search, discovery, and retrieval tools must return query results quickly. Equally important is the quality of those results. High-quality query results save time and reduce risk by providing fit-for-use data that enables faster analysis with increased confidence.
Search and discovery tools rely on metadata labeling that shares key information about what the data is. However, not all sources accurately label the data, and the labeling descriptions often are not consistent. At Harris, we resolve these shortcomings with tools that automate metadata labeling, greatly reducing time spent manually labeling. Our automated processes identify relationships or links between the attributes of the different data sets and apply rule-based, business-specific tagging and classification to standardize the metadata, which enables faster data discovery and retrieval for analysis.
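As a rough illustration of rule-based metadata standardization, the Python sketch below maps inconsistent field names onto one schema and then applies simple tagging rules. Everything in it is an assumption made for the example: the field aliases, the tag rules, and the sample record are hypothetical and do not reflect the rules or tools Harris actually uses.

```python
# Hypothetical aliases: different sources name the same attribute differently.
FIELD_ALIASES = {
    "sensor": "sensor_type",
    "sensortype": "sensor_type",
    "collected": "collection_date",
    "date": "collection_date",
}

# Hypothetical business rules: (tag, predicate over standardized metadata).
TAG_RULES = [
    ("maritime", lambda m: "harbor" in m.get("title", "").lower()),
    ("imagery", lambda m: m.get("sensor_type", "").lower() in {"eo", "sar"}),
]


def standardize(metadata):
    """Return a copy of the metadata with field names mapped to one schema."""
    clean = {}
    for key, value in metadata.items():
        fallback = key.strip().lower()
        lookup = fallback.replace(" ", "").replace("_", "")
        clean[FIELD_ALIASES.get(lookup, fallback)] = value
    return clean


def apply_tags(metadata):
    """Return the sorted tags whose rule matches the standardized metadata."""
    return sorted(tag for tag, rule in TAG_RULES if rule(metadata))


# A made-up record as it might arrive from one source.
raw = {"Sensor Type": "SAR", "Collected": "2016-03-01", "Title": "Harbor overview"}
record = standardize(raw)
tags = apply_tags(record)
```

After this runs, `record` uses the field names `sensor_type`, `collection_date`, and `title`, and `tags` contains both `imagery` and `maritime`. Keeping the aliases and rules in plain data structures means they can be reviewed and extended without touching the processing code.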
We also reduce the time-consuming manual practices of data quality testing by employing scientific algorithms and automated workflows to keep accurate information, remove conflicting data, and provide trusted data of higher quality for the analyst. We employ tools to automate data set conflation, making it easier to accurately combine data from multiple system configurations.
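The article does not describe the algorithms involved, so the following Python sketch shows only one simple way conflation could work: merge records that share an identifier, keep values the sources agree on or that only one source supplies, and set conflicting fields aside for review. The `conflate` function, the feature identifiers, and the sample values are all hypothetical.

```python
def conflate(primary, secondary):
    """Merge two keyed data sets, returning (merged, conflicts).

    Fields present in only one source, or equal in both, are kept.
    Fields whose values disagree are left out of the merged record
    and listed under the feature's identifier in `conflicts`.
    """
    merged, conflicts = {}, {}
    for feature_id in sorted(primary.keys() | secondary.keys()):
        a = primary.get(feature_id, {})
        b = secondary.get(feature_id, {})
        record = {}
        for field in sorted(a.keys() | b.keys()):
            if field in a and field in b and a[field] != b[field]:
                conflicts.setdefault(feature_id, []).append(field)
            else:
                record[field] = a[field] if field in a else b[field]
        merged[feature_id] = record
    return merged, conflicts


# Two made-up sources describing overlapping features.
source_a = {"bridge-1": {"name": "North Bridge", "lanes": 2}}
source_b = {
    "bridge-1": {"name": "North Bridge", "lanes": 4, "material": "steel"},
    "road-9": {"name": "Route 9"},
}

merged, conflicts = conflate(source_a, source_b)
```

In this example the two sources disagree on `lanes` for `bridge-1`, so that field is withheld from the merged record and reported in `conflicts`, while `material` and the `road-9` feature are carried over. A real workflow would likely resolve conflicts using source confidence or recency rather than simply withholding them.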
Tool #4: Enterprise-wide Ground or Virtual Ground Processing
As more data comes in from the expanding number of sensor systems, ground processing tools must keep pace. Organizations managing multiple concurrent programs in silos incur higher overall operational and maintenance costs due to duplication in facilities, infrastructure, and personnel resources for the separate systems. We achieve efficiencies by enhancing enterprise-wide ground systems with shared processing, workflows, and mission-specific algorithms.
Comprehensive enterprise ground architectures integrate multiple programs and connect multiple mission command and control operators with a unified interface. The latest cloud-based satellite control and ingest tools, for example, are reducing ground station investment by enabling command and control and downlink ingestion for new satellite constellations.
While there will always be missions with unique products and needs, a single infrastructure that eliminates duplicate hardware and software and reduces the number of operations personnel pays off significantly in increased efficiencies and cost savings. Even more benefit comes when those enterprise-wide ground systems are designed to allow the rapid insertion of new sensor missions and incorporation of new algorithms that find meaningful patterns in data or sort and process data into more useful information pieces.
A Competitive Advantage
When it comes to getting real intelligence from big data, improving data management and streamlining ground processing functions can give organizations a competitive advantage. Tools like those discussed here enable fast, high-quality search results, so that analysts can focus their time where it counts most—analyzing data to put confident intelligence into the hands of the warfighter.
Jackie Schmoll is director of Government Geospatial Systems within Harris’ Geospatial Solutions business. Harris provides data processing, advanced analytics, and content management solutions that turn data into trusted information to detect, analyze, predict, and respond to events around the world.
Click here to view our entire publication on Mission Confidence: Increased resiliency, rapid response, managed risk.