Predictive Analytics: What Is It and How Can It Help the Federal Government?
Nhung Mai
Powerful data analytics tools can help agencies save money and make more informed decisions.
Predictive analytics tools allow the government to get ahead of problems before they waste money, harm IT systems or cost lives. Such data analytics platforms can provide agency leaders, IT leaders, and analysts with actionable insights they can use to enhance their missions, improve their cybersecurity, save money on maintenance costs and generally make more informed decisions.
Agencies can also take advantage of open data to glean insights for and from one another, or open up data to the public and give them the opportunity to do the same.
“From spotting fraud to combatting the opioid epidemic, an ounce of prevention really is worth a pound of cure — especially in government,” Deloitte notes in a report on predictive analytics in government. “Predictive analytics is now being applied in a wide range of areas including defense, security, health care, and human services, among others.”
What Is Predictive Analytics?
For years, federal agencies employed traditional statistical analysis software (SAS) to build predictive models, but the workers who ran it were usually sequestered in back rooms without access to policymakers, notes Andrew Churchill, vice president of federal sales at analytics firm Qlik. “But now data science is in vogue and it’s the cool job,” he says.
The most basic way to understand predictive analytics is to ask, “How do I take what I can clearly see is happening and begin to, through trained models, describe what will happen based on the variables that we are feeding the machine?” Churchill says.
Mohan Rajagopalan, senior director of product management at Splunk, notes that predictive analytics involves the ability to aggregate data from a variety of sources and then predict future trends, behaviors and events based on that data. That can include identifying anomalies in data logs and predicting failures in data centers or machines on the agency’s network. It can also be used to forecast revenues, understand buying behaviors and predict demand for certain services.
“The outcome of predictive analytics is the prediction of future behaviors,” Rajagopalan says.
Adilson Jardim, area vice president for public sector sales engineering at Splunk, says that predictive analytics exists on a spectrum. On one end are basic statistical or mathematical models that can be used to predict trends, such as the average of a certain type of behavior. On the other end are more advanced forms of predictive analytics that involve the use of machine learning, in which data models are asked to infer different predictive capabilities, Jardim says.
Some customers are ingesting up to 5 petabytes of data per day, and that data can be used to understand not only what has happened but also what could or is likely to happen, he says.
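The simple end of that spectrum can be as plain as forecasting a metric from its recent average. The sketch below is an illustrative example, not any vendor's method; the login counts are invented numbers.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(series) < window:
        raise ValueError("need at least `window` observations")
    recent = list(series)[-window:]
    return sum(recent) / window

# Daily login counts for one system (illustrative numbers).
logins = [120, 118, 125, 130, 128, 131]
print(moving_average_forecast(logins, window=3))  # mean of the last three days
```

A rolling average like this is a baseline, not a sophisticated model, but it already answers "what is likely to happen next" from "what has happened."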
Predictive analytics can be applied across “a broad range of data domains,” Churchill says.
Defining the Predictive Analytics Process
There are numerous elements of the predictive analytics process, as Predictive Analytics Today notes. Here is a quick breakdown:
* Define project: Agencies first must define the scope of the analysis and what they hope to get out of it.
* Data collection: Getting the data itself and mining it can be a challenge, according to Rajagopalan. One of the big challenges federal agencies and other organizations face these days is the volume, variety, and velocity of data. “A model in the absence of trustworthy, validated and available data doesn’t yield much of a result,” Churchill adds.
* Data analysis: Another core element of the process involves algorithms that can inspect, clean, transform and analyze data to derive insights and make conclusions.
* Statistics: Predictive analytics tools need to then use statistical analysis to validate the assumptions and hypotheses and run them through statistical models.
* Modeling: Another key element is the modeling that is used to define how the data will be processed to automatically create accurate predictive models, Rajagopalan says. The algorithms can be as simple as rules that can be applied to understand a particular situation or understand data in the context of a particular scenario. There are also supervised algorithms and models that use machine learning techniques to build hypotheses around trends in the data and constantly refine themselves based on the data they are presented with.
* Deployment: IT leaders then receive the outputs of the model, such as a visualization, report or chart, and the results of the predictive analysis are passed on to decision-makers.
* Model monitoring: The models are continuously monitored to ensure they are providing the results that are expected.
Previously, Rajagopalan says, agencies had specialized units to run SAS, but those models were expensive to create. The democratization and consumerization of data and of analytics tools have made it easier to create simple and succinct summaries of data that visualize outputs.
What Is Open Data?
Joshua New, formerly a policy analyst at the Center for Data Innovation and now a technology policy executive at IBM, tells FedTech that open data is best thought of as “machine-readable information that is freely available online in a nonproprietary format and has an open license, so anyone can use it for commercial or other use without attribution.”
On May 9, 2013, former President Barack Obama signed an executive order that made open and machine-readable data the new default for government information.
“Making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government,” the Obama administration noted.
On Jan. 14, 2019, the OPEN Government Data Act, as part of the Foundations for Evidence-Based Policymaking Act, became law. The OPEN Government Data Act makes data.gov a requirement in the statute, rather than a policy. It requires agencies to publish their information online as open data, using standardized, machine-readable data formats, with their metadata included in the data.gov catalog. May 2019 marks the 10th anniversary of data.gov, the federal government’s open data site.
The General Services Administration launched the site with a modest 47 data sets, but the site has grown to over 200,000 data sets from hundreds of data sources including federal agencies, states, counties, and cities. “Data.gov provides easy access to government datasets covering a wide range of topics — everything from weather, demographics, health, education, housing, and agriculture,” according to data.gov.
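"Machine-readable, nonproprietary format" in practice often means CSV, which any language can parse with no special tooling. The snippet below is a hypothetical dataset in that shape (the counties and values are invented, not from data.gov), read with Python's standard library.

```python
import csv
import io

# A machine-readable, nonproprietary snippet in the style of an open dataset
# (counties and figures invented for illustration).
raw = """county,year,median_income
Adams,2018,52000
Adams,2019,53500
Baker,2018,48750
"""

rows = list(csv.DictReader(io.StringIO(raw)))
incomes = [int(r["median_income"]) for r in rows if r["county"] == "Adams"]
print(incomes)  # [52000, 53500]
```

Because the format carries its own column names, anyone can filter, join or chart the data without the publishing agency's software, which is the point of the open-data mandate.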
Predictive Analytics Examples in Government
Federal agencies are using predictive analytics for a wide range of use cases, including cybersecurity. Specifically, agencies are using these tools to predict insider threats, Splunk’s Jardim says. The models look at users’ backgrounds, where they have worked, how often they have logged in to networks at certain times and whether that behavior actually is anomalous. The goal of such tools is to make a good prediction of whether the security events should be tracked by human analysts, Jardim says.
“You only want to surface the events that are very clear insider threats,” he says. “The analyst is focused on high-probability events, not low-probability events.”
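One simple way to surface only high-probability events is a z-score against a user's own baseline. This is a minimal sketch of that idea, not Splunk's actual detection logic, and the login counts are invented.

```python
import statistics

def zscore(value, baseline):
    """How many standard deviations `value` sits above the user's baseline."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return (value - mu) / sigma

# After-hours logins per week for one user (illustrative numbers).
baseline = [2, 1, 3, 2, 2, 1, 3]
this_week = 14

# Surface only high-probability anomalies for a human analyst.
if zscore(this_week, baseline) > 3:
    print("flag for review")
```

The threshold of 3 standard deviations is an arbitrary choice here; tightening or loosening it trades missed threats against analyst fatigue.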
Predictive analytics can also be used for agencies’ data center maintenance by applying algorithms to look at compute capacity, how many users are accessing services and to assess throughput for mission-critical applications, Jardim says. Such tools can predict when a particular server will become overloaded and can help agencies preempt those events to ensure users have access to vital applications.
The Defense Department can also use predictive analytics to ensure that soldiers have enough of the right munitions and supplies in particular theaters of war and enough support logistics. “Logistics and operational maintenance take on a life-or-death consequence if I cannot ship enough munitions or vehicles into a specific theater,” Jardim says.
Qlik’s Churchill says that a customer within the Army is using predictive analytics tools to build models that support force enablement and predict the capabilities that will be needed in the future and which capabilities will be diminished, as well as the capabilities that will be required if certain scenarios arise.
The Pentagon is also working on predictive analytics tools for financial management via the Advanta workflow tool, which has brought together roughly 200 of the DOD’s enterprise business systems, Churchill says.
“How can they use predictive models to understand the propensity to have to de-obligate funds from a particular program in the future?” Churchill says. “As I am evaluating the formulation and execution of budgets, technologies like this have the ability to help those decision-makers identify the low-hanging fruit. How do I put those insights in front of people that they wouldn’t have gotten before?”
Predictive maintenance is also a key use case, especially for vehicles and other heavy equipment. Models can ingest data such as the weather and operating conditions of vehicles, not just how many hours they have been running, to determine when they will break down, Churchill says.
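The point that operating conditions matter more than raw hours can be shown with a toy risk score. The weights and inputs below are invented for illustration; a fielded model would learn them from maintenance records rather than hard-code them.

```python
def failure_risk(engine_hours, harsh_weather_hours, rough_terrain_hours):
    """Toy breakdown-risk score in [0, 1]: conditions are weighted more
    heavily than total engine hours. Weights are invented for illustration."""
    score = (0.4 * engine_hours
             + 1.5 * harsh_weather_hours
             + 2.0 * rough_terrain_hours) / 1000
    return min(score, 1.0)

# Two vehicles with the same engine hours but different conditions.
print(failure_risk(800, 20, 10))    # mostly mild use
print(failure_risk(800, 250, 150))  # heavy use in harsh conditions
```

The second vehicle scores far higher despite identical engine hours, so it would be pulled for maintenance first.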