Searchlight

Technologies developed for the DARPA Searchlight program were evaluated on Merge testbeds and technologies.

This page contains information about papers and software in support of the Searchlight program.

The DARPA SEARCHLIGHT Dataset of Application Network Traffic

Calvin Ardi, Connor Aubry, Brian Kocoloski, Dave DeAngelis, Alefiya Hussain, Matt Troglia, and Stephen Schwab. 2022. The DARPA SEARCHLIGHT Dataset of Application Network Traffic. In Proceedings of the 15th Workshop on Cyber Security Experimentation and Test (CSET '22). Association for Computing Machinery, New York, NY, USA, 59–64. DOI: 10.1145/3546096.3546103.

Researchers are in constant need of reliable data to develop and evaluate AI/ML methods for networks and cybersecurity. While Internet measurements can provide realistic data, such datasets lack ground truth about application flows. We present a ∼750 GB dataset that includes ∼2000 systematically conducted experiments and the resulting packet captures with video streaming, video teleconferencing, and cloud-based document editing applications. This curated and labeled dataset has bidirectional and encrypted traffic with complete ground truth that can be widely used for assessments and evaluation of AI/ML algorithms.

@inproceedings{10.1145/3546096.3546103,
author    = {Ardi, Calvin and Aubry, Connor and Kocoloski, Brian and
    DeAngelis, Dave and Hussain, Alefiya and Troglia, Matt and Schwab,
    Stephen},
title     = {The DARPA SEARCHLIGHT Dataset of Application Network Traffic},
year      = 2022,
month     = aug,
isbn      = {9781450396844},
publisher = {Association for Computing Machinery},
address   = {New York, NY, USA},
url       = {https://doi.org/10.1145/3546096.3546103},
doi       = {10.1145/3546096.3546103},
abstract  = {Researchers are in constant need of reliable data to
    develop and evaluate AI/ML methods for networks and cybersecurity.
    While Internet measurements can provide realistic data, such
    datasets lack ground truth about application flows. We present a
    ∼750 GB dataset that includes ∼2000 systematically conducted
    experiments and the resulting packet captures with video streaming,
    video teleconferencing, and cloud-based document editing
    applications. This curated and labeled dataset has bidirectional and
    encrypted traffic with complete ground truth that can be widely used
    for assessments and evaluation of AI/ML algorithms.},
booktitle = {Proceedings of the 15th Workshop on Cyber Security Experimentation and Test},
pages     = {59--64},
numpages  = {6},
keywords  = {datasets, network experimentation, network traffic},
location  = {Virtual, CA, USA},
series    = {CSET '22}
}

Generating Representative Video Teleconferencing Traffic

David DeAngelis, Alefiya Hussain, Brian Kocoloski, Calvin Ardi, and Stephen Schwab. 2022. Generating Representative Video Teleconferencing Traffic. In Cyber Security Experimentation and Test Workshop (CSET '22). Association for Computing Machinery, New York, NY, USA, 100–104. DOI: 10.1145/3546096.3546107.

Video teleconferencing (VTC) is a dominant network application, yet there is a dearth of tools to generate such traffic for systematic and reproducible experimentation. We present a framework to create representative video teleconferencing traffic and discuss our methodology for behavioral control of multiple bots to create human-like dialog coordination, including interactive talking and silence patterns. Our framework can be coupled with proprietary commercial VTC applications as well as deployed completely within a testbed environment to benchmark emerging networking technology and evaluate the next generation of traffic classification, quality of service (QoS) algorithms, and traffic engineering systems.

@inproceedings{10.1145/3546096.3546107,
author = {DeAngelis, David and Hussain, Alefiya and Kocoloski, Brian and
    Ardi, Calvin and Schwab, Stephen},
title = {Generating Representative Video Teleconferencing Traffic},
year = {2022},
isbn = {9781450396844},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3546096.3546107},
doi = {10.1145/3546096.3546107},
abstract = {Video teleconferencing (VTC) is a dominant network
    application, yet there is a dearth of tools to generate such traffic
    for systematic and reproducible experimentation. We present a framework
    to create representative video teleconferencing traffic and discuss our
    methodology for behavioral control of multiple bots to create
    human-like dialog coordination, including interactive talking and
    silence patterns. Our framework can be coupled with proprietary
    commercial VTC applications as well as deployed completely within a
    testbed environment to benchmark emerging networking technology and
    evaluate the next generation of traffic classification, quality of
    service (QoS) algorithms, and traffic engineering systems.},
booktitle = {Cyber Security Experimentation and Test Workshop},
pages = {100--104},
numpages = {5},
keywords = {video teleconference, VoIP, network traffic generation,
    cybersecurity testbeds},
location = {Virtual, CA, USA},
series = {CSET 2022}
}

Case Studies in Experiment Design on a Minimega Based Network Emulation Testbed

Brian Kocoloski, Alefiya Hussain, Matthew Troglia, Calvin Ardi, Steven Cheng, Dave DeAngelis, Christopher Symonds, Michael Collins, Ryan Goodfellow, and Stephen Schwab. 2021. Case Studies in Experiment Design on a minimega Based Network Emulation Testbed. In Cyber Security Experimentation and Test Workshop (CSET '21). Association for Computing Machinery, New York, NY, USA, 83–90. DOI: 10.1145/3474718.3474730.

This paper describes our team’s experience using minimega, a network emulation system using node and network virtualization, to support evaluation of a set of networked and distributed systems for topology discovery, traffic classification and engineering in the DARPA Searchlight program. We present the methodology we developed to encode network and traffic definitions into an experiment description model (EDM), and how our tools compile this model onto the underlying minimega API. We then present three case studies which demonstrate the ability of our EDM to support experiments with diverse network topologies, diverse traffic mixes, and networks with specialized layer-2 connectivity requirements. We conclude with the overall takeaways from using minimega to support our evaluation process.

@inproceedings{10.1145/3474718.3474730,
author    = {Kocoloski, Brian and Hussain, Alefiya and Troglia, Matthew and
    Ardi, Calvin and Cheng, Steven and DeAngelis, Dave and Symonds,
    Christopher and Collins, Michael and Goodfellow, Ryan and Schwab,
    Stephen},
title     = {Case Studies in Experiment Design on a Minimega Based Network
    Emulation Testbed},
year      = 2021,
isbn      = {9781450390651},
publisher = {Association for Computing Machinery},
address   = {New York, NY, USA},
url       = {https://doi.org/10.1145/3474718.3474730},
doi       = {10.1145/3474718.3474730},
abstract  = {This paper describes our team’s experience using minimega, a
    network emulation system using node and network virtualization,
    to support evaluation of a set of networked and distributed
    systems for topology discovery, traffic classification and
    engineering in the DARPA Searchlight program. We present the
    methodology we developed to encode network and traffic
    definitions into an experiment description model (EDM), and how
    our tools compile this model onto the underlying minimega API. We
    then present three case studies which demonstrate the ability
    of our EDM to support experiments with diverse network
    topologies, diverse traffic mixes, and networks with specialized
    layer-2 connectivity requirements. We conclude with the overall
    takeaways from using minimega to support our evaluation
    process.},
booktitle = {Cyber Security Experimentation and Test Workshop},
pages     = {83--90},
numpages  = {8},
location  = {Virtual, CA, USA},
series    = {CSET '21}
}

Building Reproducible Video Streaming Traffic Generators

Calvin Ardi, Alefiya Hussain, and Stephen Schwab. 2021. Building Reproducible Video Streaming Traffic Generators. In Cyber Security Experimentation and Test Workshop (CSET '21). Association for Computing Machinery, New York, NY, USA, 91–95. DOI: 10.1145/3474718.3474721.

Video streaming traffic dominates Internet traffic. However, there is a dearth of tools to generate such traffic on emulation-based testbeds. In this paper we present tools to create representative and reproducible video streaming traffic to evaluate the next generation of traffic classification, Quality of Service (QoS) algorithms and traffic engineering systems. We discuss 27 different combinations of streaming video traffic types in this preliminary work, and illustrate the diversity of network-level dynamics in these protocols.

@inproceedings{10.1145/3474718.3474721,
author    = {Ardi, Calvin and Hussain, Alefiya and Schwab, Stephen},
title     = {Building Reproducible Video Streaming Traffic Generators},
year      = 2021,
month     = aug,
isbn      = {9781450390651},
publisher = {Association for Computing Machinery},
address   = {New York, NY, USA},
url       = {https://doi.org/10.1145/3474718.3474721},
doi       = {10.1145/3474718.3474721},
abstract  = {Video streaming traffic dominates Internet traffic.
    However, there is a dearth of tools to generate such traffic on
    emulation-based testbeds. In this paper we present tools to
    create representative and reproducible video streaming traffic
    to evaluate the next generation of traffic classification,
    Quality of Service (QoS) algorithms and traffic engineering
    systems. We discuss 27 different combinations of streaming video
    traffic types in this preliminary work, and illustrate the
    diversity of network-level dynamics in these protocols.},
booktitle = {Cyber Security Experimentation and Test Workshop},
pages     = {91--95},
numpages  = {5},
location  = {Virtual, CA, USA},
series    = {CSET '21}
}