The Defense Threat Reduction Agency is seeking innovative technologies to accomplish automated text processing and visual analytics for biological materials situational awareness in support of the Biological Materials Information Program (BMIP).
BMIP provides a dynamic compendium of information focusing on the identification and characterization of pathogen repositories worldwide. BMIP directly supports the warfighter by providing on-the-ground access to biological materials information for optimal situational awareness during targeting and exploitation operations.
Currently, facility identification, research, and data population are limited by the reliance on manually extracting relevant information from publicly available data sources and populating each facility record in the database. Analysts can spend days researching and collecting information on a single facility. To date, over 1750 pathogen repositories have been identified, but more than half of those have not been fully characterized because of limited resources. These limitations highlight a requirement to automatically collect, ingest, analyze, and summarize data, decreasing the time required for ingesting and categorizing data while increasing situational awareness of biological material holdings worldwide.
Additionally, since the initial concept was developed in 2013, the BMIP user base has expanded to include Special Operations Command, the geographic combatant commands, the public health community, and foreign partners.
Recent advancements in text-based natural language processing (NLP) and visual analytics offer the potential to automate the collection, link analysis, and summary presentation of data associated with biological events, thus dramatically reducing the time required to identify and analyze new biological facilities and enabling a more rapid response for the warfighter.
Proposals should address potential solutions that integrate with the BMIP database and provide a common workspace for exploring all relevant data sources. Proposed software products should use a services-based architecture to automatically ingest text data, invoke NLP tools, and enable flexible visualization of results.
Open source data sets should be leveraged for input into the BMIP database, including, but not limited to, PubMed, ProMed, Web of Science, HealthMap, Google Scholar, ScienceDirect, and arXiv. Existing NLP tools should be leveraged to automatically ingest available information related to facility sites, perform link analysis on it, and summarize it. NLP solutions should focus on pathogen repository characterization, including identification of unique funding sources, author affiliations, and facilities.
Finally, visualization techniques should be proposed to enable analyst navigation of these new results, displaying complex linkages (e.g. authors) while also summarizing derived statistics (e.g. pathogens per facility).
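As an illustration only, the two kinds of output named above (author linkages and per-facility pathogen counts) could be derived from extracted records along the following lines. This is a minimal sketch, not part of the solicitation: the record layout, the field names (facility, pathogens, authors), the sample values, and both function names are hypothetical assumptions, and a real BMIP integration would define its own schema.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical records, shaped as an upstream NLP extraction step might emit
# them. Field names and values are assumptions made for this sketch.
records = [
    {"facility": "Facility A",
     "pathogens": ["Bacillus anthracis", "Yersinia pestis"],
     "authors": ["Author 1", "Author 2"]},
    {"facility": "Facility A",
     "pathogens": ["Yersinia pestis"],
     "authors": ["Author 2", "Author 3"]},
    {"facility": "Facility B",
     "pathogens": ["Francisella tularensis"],
     "authors": ["Author 1", "Author 3"]},
]


def pathogens_per_facility(records):
    """Count the distinct pathogens mentioned for each facility."""
    seen = defaultdict(set)
    for record in records:
        seen[record["facility"]].update(record["pathogens"])
    return {facility: len(names) for facility, names in seen.items()}


def coauthor_links(records):
    """Count how often each pair of authors appears on the same record."""
    links = Counter()
    for record in records:
        # Sort so each pair has one canonical ordering, e.g. (A, B) not (B, A).
        for pair in combinations(sorted(set(record["authors"])), 2):
            links[pair] += 1
    return links


if __name__ == "__main__":
    print(pathogens_per_facility(records))
    print(coauthor_links(records))
```

The per-facility counts could feed a summary table or chart, while the weighted author pairs form the edge list of a link-analysis graph that a visualization layer could render.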
The first year of the effort should focus on developing a proof-of-concept for demonstration to DTRA. The ability to collect, analyze, and present the resulting data should be demonstrated. Following feedback from DTRA, the objective for year 2 is to develop user-friendly software that can be deployed at or accessed from DTRA.
Further details are available via Solicitation Number: HQ0034-17-BAA-RIF-0001B. The current deadline for white paper submission is May 19, 2017, at 3:00 pm Eastern.