Dealing with Big Data
The number and types of sensors and intelligence-gathering techniques are growing exponentially, collecting petabytes of new data intended to help us tackle big questions and make important decisions. The challenge is to glean meaningful insights out of all that data—and do it in a timely manner.
The key lies in increased automation, and at Harris, we are focused on three areas of process automation: workflow optimization, computer or “machine” learning, and data analytics. We believe that together, these can revolutionize a user’s ability to be agile in delivering effective, mission-critical direction.
The process of taking data points from their raw, collected state to useful information represents a system workflow. Using computers to automate portions of that workflow, such as data filtering and sorting, extraction, and validation, speeds up the process and minimizes the potential for human error. For example, an analyst can spend hours manually combing through data files for cloud-free imagery in a subtropical region, where clouds dominate the skies. By developing automated metadata interrogation techniques, we can make a computer “look” through the same data to deliver the most recent cloud-free image in only seconds. And when you are dealing with big data, the computational power of several processors working in tandem can deliver results for multiple inquiries concurrently, saving even more time and money.
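As an illustrative sketch of automated metadata interrogation, the Python below filters a small catalog of image records by cloud cover and returns the most recent qualifying scene. The record fields (`scene_id`, `acquired`, `cloud_cover_pct`), the 5 percent cutoff, and the sample values are assumptions made for this example, not a description of any particular Harris system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical metadata record for one collected image."""
    scene_id: str
    acquired: date
    cloud_cover_pct: float  # 0.0 (clear) to 100.0 (fully overcast)


def latest_cloud_free(records: Iterable[ImageRecord],
                      max_cloud_pct: float = 5.0) -> Optional[ImageRecord]:
    """Return the most recent record at or below the cloud-cover limit."""
    clear = [r for r in records if r.cloud_cover_pct <= max_cloud_pct]
    # max() keyed on the acquisition date picks the newest qualifying image;
    # default=None covers the case where nothing qualifies.
    return max(clear, key=lambda r: r.acquired, default=None)


# Invented sample catalog, deliberately not sorted by date.
catalog = [
    ImageRecord("scene-001", date(2019, 3, 1), 82.0),
    ImageRecord("scene-002", date(2019, 3, 4), 3.5),
    ImageRecord("scene-003", date(2019, 3, 9), 64.0),
    ImageRecord("scene-004", date(2019, 3, 7), 1.0),
]

best = latest_cloud_free(catalog)
print(best.scene_id if best else "no cloud-free image found")
```

Because the function reads only metadata, it never has to open the imagery itself, which is what makes this kind of query fast relative to a manual search.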
In fast-paced, highly competitive environments, the advantage will be in the hands of those able to ingest data from different sources and bring it into one common system. Combining data from a broad variety of intelligence sources—such as historical data and maintenance records, single-shot or motion imagery, three-dimensional point clouds, multispectral and hyperspectral reflectance, signals, or synthetic aperture radar—gives users the most comprehensive picture possible. Workflows that incorporate “sensor-agnostic” processing and open architectures will be able to meet multi-intelligence needs and adapt easily to the incorporation of new technologies.
INCORPORATING MACHINE LEARNING
Current practice in dealing with unstructured big data flows is to hire analysts to categorize and tag data with rich metadata labels so that computer processors know what to do with it. This is known as giving structure to unstructured content and, when done manually, is a very labor-intensive and time-consuming task. Advancements in machine learning are making it possible for us to train computers to do this automatically, resulting in large volumes of metadata-rich content available for processing.
Similar to the way Google, Facebook, or Amazon can automatically identify objects or people in images, we can train computers to recognize and identify objects by certain attributes, or features. These features are then used to identify an object, a material, or a specific signal. Further, deep learning techniques can be applied to labeled training data sets to help jump-start the learning cycle until live data sets are available. With computer processors and artificial intelligence technology able to do the initial analysis for data tagging, analysts are freed up for tasks that demand higher cognition.
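To make the idea of learning from labeled training data concrete, here is a minimal sketch of a nearest-centroid classifier in plain Python. It is a deliberately simple stand-in for the deep learning models discussed here, not the method Harris uses; the two features (length and width in meters), the sample values, and the "vehicle" and "ship" labels are invented for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Features = Sequence[float]


def train_centroids(samples: List[Tuple[Features, str]]) -> Dict[str, List[float]]:
    """Average the feature vectors of each label to get one centroid per label."""
    grouped: Dict[str, List[Features]] = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def classify(centroids: Dict[str, List[float]], features: Features) -> str:
    """Tag a new observation with the label of the closest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical labeled training set: (length_m, width_m) -> label.
training = [
    ((4.5, 1.8), "vehicle"),
    ((5.0, 2.0), "vehicle"),
    ((120.0, 20.0), "ship"),
    ((200.0, 30.0), "ship"),
]

model = train_centroids(training)
print(classify(model, (4.8, 1.9)))     # closest to the "vehicle" centroid
print(classify(model, (150.0, 25.0)))  # closest to the "ship" centroid
```

The same pattern holds for more capable models: labeled examples are used to fit the model once, and the fitted model then tags new data without an analyst in the loop.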
Harris is applying deep learning technology to detect weather conditions from traffic cameras, find and rate damage on wind turbine blades using drone motion imagery, detect planes, vehicles, and ships from aerial imagery, and identify railway obstructions from LiDAR data. Our applications automate the extraction of meaningful information from the unique data sets of our community, such as LiDAR and hyperspectral data; signals intelligence; and system performance, health, and status data.
REFINING RESULTS WITH DATA ANALYTICS
Computer algorithms find meaningful patterns that identify trends and relationships or sort and process data into more useful information pieces. Analytics then produce a set of solutions to answer a question, provide if-then scenarios, or generate alert notifications that help analysts key in on the best available information across the complete set of data from the various sources. By integrating algorithms with analytics, we can get desired results from big data sources more quickly.
Advanced systems capitalize on proven algorithms and incorporate commercially available and custom analytics to seamlessly bring data sets together, process the data into information, and identify the key information of interest. With automated processing, we can generate automated reports or customized alerts when prescribed thresholds are reached or identified, giving analysts a heads-up to investigate or take action.
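The alerting step can be sketched in a few lines of Python. The metric names, threshold values, and readings below are hypothetical; a real system would draw them from its own configuration and data feeds.

```python
from typing import Dict, List


def generate_alerts(readings: Dict[str, float],
                    thresholds: Dict[str, float]) -> List[str]:
    """Return one alert message per metric that meets or exceeds its threshold."""
    alerts = []
    for metric, limit in thresholds.items():
        value = readings.get(metric)
        # Metrics with no reading in this batch are skipped rather than flagged.
        if value is not None and value >= limit:
            alerts.append(f"ALERT: {metric} = {value} (threshold {limit})")
    return alerts


# Hypothetical prescribed thresholds and one batch of processed readings.
thresholds = {"vehicle_count": 50, "blade_damage_score": 0.7}
readings = {"vehicle_count": 63, "blade_damage_score": 0.4}

for message in generate_alerts(readings, thresholds):
    print(message)
```

Keeping the thresholds in data rather than in code lets analysts adjust what counts as noteworthy without changing the processing logic.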
There are a few general categories of analytics that help us pull progressively more important information out of combined data sets. The first level of analytics, known as descriptive analytics, aims to identify what happened, like a change in the number of objects. The resulting information tells a story or identifies something that needs to be investigated from within large amounts of data.
The next level, diagnostic analytics, takes this understanding one step further by identifying why something happened, providing analysts with information they can use to make decisions.
Once the analytics in a domain are reliably providing descriptive and diagnostic results, we can further increase the value of decision-making information with predictive analytics. Here, the system begins to predict with a high level of confidence that something will happen, based on combined mathematical and statistical calculations of expected outcomes under the specific data conditions.
Harris’ research and investments in the geospatial domain are focused on advancing machine learning capabilities so that outcomes can be predicted more reliably, with the goal of offering predictive analytics and predictive systems maintenance. Predictive analytics have the potential to reduce system life-cycle cost and improve mission reliability by giving advance notice of looming component failures, which can then be managed to limit system downtime and loss of mission capability.
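One simple way to picture advance notice of a component failure is to fit a trend to a health metric and estimate when it will reach a failure limit. The sketch below assumes the degradation is roughly linear, which real components often are not, and it is not a description of Harris' models; the vibration readings, the limit of 6.0, and the function names are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until_limit(hours: Sequence[float],
                      metric: Sequence[float],
                      limit: float) -> Optional[float]:
    """Estimate operating hours left before the fitted trend reaches the limit.

    Returns None when the fitted trend is flat or improving, since no
    crossing can be projected in that case.
    """
    slope, intercept = fit_line(hours, metric)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    # Clamp at zero: the trend may already be past the limit.
    return max(crossing - hours[-1], 0.0)


# Hypothetical vibration readings taken every 100 operating hours.
hours = [0, 100, 200, 300, 400]
vibration = [2.0, 2.5, 3.0, 3.5, 4.0]

remaining = hours_until_limit(hours, vibration, limit=6.0)
print(f"Estimated hours until limit: {remaining:.0f}")
```

An estimate like this is what allows maintenance to be scheduled ahead of a failure instead of in response to one.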
The pinnacle of the hierarchy of information value is prescriptive analytics. More difficult to achieve, but more valuable to possess, prescriptive analytics provide insight into the steps that can be taken to prevent something from happening or to cause it to occur. Harris is working in this area to fine-tune results that support daily systems maintenance.
BRINGING IT ALL TOGETHER
Through computer automation techniques, activities that once took days or weeks can now be completed in hours, enabling users to assign personnel to tasks that require higher levels of thinking. By capitalizing on advancements in technology and algorithm development, we are melding massive datasets into intelligence that matters for detecting change, predicting what will happen next, and prescribing maintenance to prevent failures. Tools like our Helios® weather analytics platform turn raw data points into live road-weather condition alerts to support dynamic truck routing. With these tools, our domain expertise, and first-hand sensor knowledge, we can deliver innovation in the ever-changing environment of data channels to create unique, specialized intelligence.
Erik Arvesen is vice president and general manager of Harris’ Geospatial Solutions business unit. Harris provides a broad range of geospatial products, content management, advanced geospatial analytics, machine learning, and commercial geospatial solutions.