Research Statement
Research Overview
As a computational scientist, I develop adaptive and dynamic computing methods for remote sensing, robotics, and machine learning. Traditional static algorithms are engineered to handle all anticipated scenarios by exhaustively exploring a large parameter space during training and development, which results in unnecessary computational complexity during inference. In contrast, many real-world scenarios require only a small, context-specific subset of computations. My research leverages this observation to design modular, self-adaptive algorithms that dynamically activate only the computations needed for the encountered scenario. This capability is critical for remote sensing and robotics, where systems operate under strict constraints on onboard computing, energy, power, and reaction time. My research contributions enable algorithms to perceive, interpret, and respond to environmental cues so as to tailor their computational pathways in real time. Primary innovations include:
- Algorithm frameworks for dynamic computing, which restructure traditional pipelines to support selective execution.
- Adaptive models utilizing perceived context, capable of identifying what computations are required based on environmental, data, and mission factors.
- Deployment methodologies for complex environments, where information may be sparse, noisy, or multimodal.
Because these methods are at the intersection of computation and domain science, my work is highly interdisciplinary. I collaborate closely with experts across robotics, hydrology, geology, and astronomy to co-design projects and solutions. My research experiences have yielded eight first-author publications, multiple awards, invited talks, and open-sourced software and data.
Algorithm Frameworks for Dynamic Computing
Static algorithms can be expressed as functions $y = f(x)$, where the computation is fixed for any input instance $x$. In contrast, dynamic computing reformulates the algorithm as $y = f(x; g)$, where a gate variable $g$ determines which computational substructures are activated, while producing a similar output. This design enables substantial flexibility: different substructures correspond to different computational pathways that can be chosen based on context. However, when the number of available substructures is large, they cannot all be loaded into memory simultaneously. Real-time applications thus face a bottleneck due to context switching, where one substructure must be unloaded and another loaded, incurring significant overhead.
To address this issue, I propose a hierarchical composition of substructures that eliminates the need for expensive context switches. In my framework, the algorithm is organized into multiple cumulative levels: each higher-level substructure contains all components of the levels below it. Because all levels are preloaded prior to runtime and higher levels overlap with lower ones, switching to a different computational pathway incurs no additional memory or loading cost. As a result, the runtime overhead is bounded by the cost of executing the largest substructure. A key research challenge is designing $f$ so that it can be executed under this hierarchical structure. I present both domain-specific and general solutions. Each approach assumes that consecutive inputs $x_t$ and $x_{t+1}$ are parametrically correlated, which is a realistic assumption.
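To make the cumulative-level structure concrete, the following minimal sketch (PyTorch, with illustrative dimensions and names of my own choosing; it is not the deployed implementation) shows how level $k$ executes blocks $0..k$, so every pathway is preloaded and no substructure is ever swapped in or out of memory:

```python
import torch
import torch.nn as nn

class HierarchicalDynamicNet(nn.Module):
    """Sketch of the cumulative-level idea: level k executes blocks 0..k,
    so every higher level reuses (overlaps with) all lower levels and no
    substructure is ever loaded or unloaded at runtime."""

    def __init__(self, dims=(16, 32, 64), out_dim=4):
        super().__init__()
        # All levels are built (and hence preloaded) once, up front.
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.LazyLinear(d), nn.ReLU()) for d in dims
        )
        # One lightweight head per level so each level yields a usable output.
        self.heads = nn.ModuleList(nn.LazyLinear(out_dim) for _ in dims)

    def forward(self, x, level):
        # The gate variable g is simply the chosen level here: run the
        # shared prefix of blocks, then the head for that level.
        for block in self.blocks[: level + 1]:
            x = block(x)
        return self.heads[level](x)

net = HierarchicalDynamicNet()
x = torch.randn(1, 8)
cheap = net(x, level=0)  # smallest substructure
full = net(x, level=2)   # largest substructure; reuses levels 0 and 1
```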
My domain-specific method introduces a two-level hierarchical structure designed for real-time drone navigation. The first level performs the majority of the computations, using a Deep Neural Network (DNN) to infer depth information from a monocular RGB image at time $t$. The second level uses a lightweight custom algorithm to extrapolate this depth information across future frames $t+1, \dots, t+k$, thereby avoiding frequent, expensive DNN inferences. I present an application of this method in [1]: small drones typically rely on monocular cameras due to size, weight, and power constraints, yet monocular images lack native depth information. While monocular depth estimation with a DNN is accurate, it is computationally expensive for resource-constrained drones. My hierarchical pipeline resolves this issue by activating the DNN only when needed and using the low-cost extrapolation algorithm otherwise. This approach yields substantial performance gains: a 19% reduction in inference time, a 26% reduction in power consumption, and a 20% reduction in total energy expenditure. Increasing the extrapolation horizon can further reduce computational overhead, at the cost of a degradation in accuracy. By decreasing inference latency, the drone reacts more quickly to environmental changes; by lowering power and energy usage, more onboard operations can be executed in parallel and the time before the drone must be recharged is extended.
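A schematic of the resulting control loop, with toy stand-ins for the DNN and the extrapolator (the actual components in [1] are more sophisticated, and the fixed horizon here is an illustrative simplification):

```python
import numpy as np

def navigate(frames, depth_dnn, extrapolate, horizon=4):
    """Sketch of the two-level pipeline: run the expensive depth DNN only
    every `horizon` frames, and fill the gaps with a cheap extrapolator
    that propagates the last depth map."""
    depth = None
    for t, frame in enumerate(frames):
        if t % horizon == 0:
            depth = depth_dnn(frame)           # level 1: expensive inference
        else:
            depth = extrapolate(depth, frame)  # level 2: cheap update
        yield depth

# Toy stand-ins to make the sketch runnable.
dnn = lambda frame: frame.mean(axis=-1)  # fake monocular depth estimate
extrap = lambda depth, frame: depth      # naive "no motion" extrapolation
frames = [np.random.rand(64, 64, 3) for _ in range(8)]
depths = list(navigate(frames, dnn, extrap))
```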
My general solution augments a preexisting family of architectures from the literature: dynamic neural networks [2], which can change their structure at runtime (during inference). Specific variants scale neural operations at runtime in terms of depth with early exits, width with slimmable networks, or other substructures with more novel approaches. In each case, the substructures are nested in a way that aligns naturally with my proposed hierarchical framework. Moreover, because any computation can theoretically be expressed as a neural operation, dynamic neural networks provide a generalizable platform for dynamic computing across various domains. However, the literature has lacked a method to determine which nested substructure to select for a given situation. This missing component is critical for making dynamic neural networks genuinely adaptive. I propose a novel mechanism for determining the appropriate substructures based on perceived context, as described in the next section.
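As a width-wise example of such nesting, a slimmable layer can be sketched as follows; the prefix-slicing scheme is illustrative rather than a specific architecture from [2] (a full slimmable network also slims the input features of deeper layers):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlimmableLinear(nn.Linear):
    """Sketch of width-wise nesting: at slimming factor rho, only the
    first rho-fraction of output features is computed, so the narrow
    network is a strict sub-network of every wider one."""

    def forward(self, x, rho=1.0):
        n_out = max(1, int(rho * self.out_features))
        return F.linear(x, self.weight[:n_out], self.bias[:n_out])

layer = SlimmableLinear(16, 64)
x = torch.randn(1, 16)
y_half = layer(x, rho=0.5)  # 32 output features, a prefix of...
y_full = layer(x, rho=1.0)  # ...the 64 computed at full width
assert torch.allclose(y_full[:, :32], y_half)
```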
Adaptive Models Utilizing Perceived Context
In my thesis work, I introduce a novel method for augmenting dynamic algorithms with self-adaptive capabilities, enabling them to autonomously select the optimal computational substructures in response to a given scenario. In principle, the intermediate variables (latent features) produced during execution of $f$ contain rich contextual information that could be used to determine the most appropriate substructure. This creates an apparent paradox: computing these latent features requires executing $f$ itself, which defeats the purpose of adaptive computation, since the substructures must be chosen before execution.
I resolve this paradox by leveraging the assumption that consecutive inputs are parametrically correlated in time. Specifically, under the assumption that $x_t$ and $x_{t+1}$ are correlated, the intermediate computations executed at time $t$ can be used to predict the optimal substructures for time $t+1$. This allows intermediate features to be incorporated into the adaptive logic while reducing the number of redundant computations. Formally, I define the optimization problem:
$$\operatorname*{argmin}_{\pi} \; \mathbb{E}\big[\mathcal{C}(g_t)\big] \quad \text{s.t.} \quad \mathcal{A} \geq \mathcal{A}_{\min},$$
where $\pi$ is a learnable policy that is used to control the flow of data processing, computations, and other operations within the dynamic system, and $g_t = \pi(z_{t-1})$ realizes the gate control variable that activates substructures during execution of $f$ at each time $t$. The objective, $\mathbb{E}\big[\mathcal{C}(g_t)\big]$, is the expected computational cost of the activated substructures, minimized subject to keeping task accuracy $\mathcal{A}$ above a required threshold $\mathcal{A}_{\min}$.
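In code form, the temporal decoupling looks roughly like the loop below; the interfaces of $f$ and $\pi$ (returning latents alongside outputs, and mapping latents to gates) are assumptions for exposition:

```python
import torch

def adaptive_loop(frames, f, policy):
    """Sketch of temporal gating: the latent features z_t cached while
    executing f at time t are what the policy uses to pick the gate for
    the *next* input, so no extra forward pass is ever needed."""
    g = 0  # start conservatively with the smallest substructure
    for x in frames:
        y, z = f(x, g)  # execute only the gated substructure
        g = policy(z)   # choose the next gate from this step's latents
        yield y, g

# Toy stand-ins to make the sketch runnable.
f = lambda x, g: (x.sum(), x[: g + 1])        # fake output and latent
policy = lambda z: int(z.abs().mean() > 0.5)  # fake context-driven gate
frames = [torch.randn(4) for _ in range(5)]
outputs = list(adaptive_loop(frames, f, policy))
```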
Designing $\pi$ is a challenging task: the combinatorial space of substructure sequences across time, data modalities, and scenarios is often intractable to search exhaustively, so I employ (deep) reinforcement learning. However, training an end-to-end policy with reinforcement learning is difficult, because the structure of the policy may change during training, resulting in unstable gradients and poor convergence. To address this challenge, I decouple the training procedure into two stages. In the first stage, a portion of the policy is trained using supervised learning to operate as a dynamic algorithm, such that different substructures can be activated during inference. In the second stage, the remaining portion of the policy is trained using reinforcement learning as a static algorithm, which learns to select among the available substructures within the dynamic component while simultaneously fulfilling downstream task objectives. This two-stage training strategy stabilizes learning while preserving the flexibility necessary for effective online adaptation.
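Schematically, the two stages can be sketched as follows; the backbone, data loader, environment, and policy interfaces are all assumed placeholders rather than my actual training code:

```python
import random

# Stage 1 (supervised): train the dynamic backbone so that *every* gate
# setting yields a usable output, e.g., by sampling a random substructure
# per batch. Stage 2 (RL): freeze that structure and train the gate
# policy alone, which avoids unstable gradients from a changing policy.

def stage1(backbone, loader, loss_fn, opt, n_gates):
    for x, target in loader:
        g = random.randrange(n_gates)  # random substructure per batch
        loss = loss_fn(backbone(x, g), target)
        opt.zero_grad(); loss.backward(); opt.step()

def stage2(policy, backbone, env, opt, episodes=100):
    # Backbone weights are frozen: only the (now structurally static)
    # policy learns to pick gates from cached latents.
    for p in backbone.parameters():
        p.requires_grad_(False)
    for _ in range(episodes):
        z, done = env.reset(), False
        while not done:
            g, logp = policy.sample(z)     # gate from cached latents
            z, reward, done = env.step(g)  # reward trades accuracy vs cost
            loss = -logp * reward          # one-step REINFORCE-style update
            opt.zero_grad(); loss.backward(); opt.step()
```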
By applying my framework to slimmable neural networks, I augmented them to be self-adaptive by autonomously selecting an optimal slimming factor, $\rho$, as a function of perceived context. The slimming factor $\rho \in (0, 1]$ determines the fraction of nodes activated in each layer of a slimmable network, such that a factor $\rho_1$ uses a strict subset of the nodes used at any factor $\rho_2 > \rho_1$.
I first applied this approach to autonomous drone navigation, where task accuracy, $\mathcal{A}$, was defined as the percentage of episodes that successfully reached a variable target location. The navigation policy took as input depth maps from both a forward-facing and a downward-facing depth sensor. While neural networks have been shown to be effective for autonomous navigation, their computational demands clash with the constraints of small, resource-limited drones. By integrating my adaptive computing methods into the navigation pipeline: (1) the number of computations was reduced to 57-92% of the original static model, and (2) the resolutions of the depth sensors were scaled down to 61-80% of their full capacity [3].
The second result highlights a unique extension of my proposed framework: it can be used not only to adapt computations, but also to dynamically control sensor modalities. In this case, the input dimensionality of the navigation network directly controlled the resolution of scalable LiDARs, and consequently their power consumption. This reduction in resource consumption further amplifies the energy savings from adaptive computations.
I demonstrate a second application of my adaptive computing framework in the context of wireless communication for edge computing, specifically split computing. In this scenario, a drone captures monocular RGB images and utilizes a deep neural network to estimate depth for downstream navigation. Rather than transmitting raw sensor data, the system employs supervised compression via split computing: an initial portion of the DNN is executed onboard the drone, and the resulting latent representation is compressed and transmitted to a nearby server, which completes the remaining computations.
Previous split computing approaches typically rely on a fixed data rate, which can be suboptimal because the compressibility of latent representations varies significantly across scenarios. By integrating my adaptive policy into this pipeline, the transmission rate is adjusted in response to context, reducing the data rate by up to 95% relative to a fixed-rate baseline [4]. This corresponds to fewer active network parameters on the drone, yielding lower power consumption and faster inference, while the reduction in transmitted data leads to lower communication latency and improved system responsiveness.
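A minimal sketch of an adaptive split, simulated on one machine with assumed shapes and a zero-padding scheme (the design in [4] differs in its details):

```python
import torch
import torch.nn as nn

class SplitDepthNet(nn.Module):
    """Sketch of adaptive split computing: a head runs onboard, its latent
    is truncated to a context-dependent bottleneck width (the effective
    data rate), and a server-side tail finishes the computation."""

    def __init__(self, in_dim=128, latent_dim=32, out_dim=16):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU())
        self.tail = nn.Linear(latent_dim, out_dim)

    def forward(self, x, keep):
        z = self.head(x)            # onboard computation
        z_tx = z[:, :keep]          # transmit only `keep` channels
        z_rx = torch.zeros_like(z)  # server pads the rest with zeros
        z_rx[:, :keep] = z_tx       # (simulated here in-process)
        return self.tail(z_rx)      # server-side computation

net = SplitDepthNet()
x = torch.randn(1, 128)
y_low = net(x, keep=4)    # easy scene: aggressive compression
y_high = net(x, keep=32)  # hard scene: full-rate latent
```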
Deploying to Complex Environments
I work with real-world data collected at various times, geographic locations, spatial resolutions, and sensing modalities. These datasets differ not only in structure, but also in dimensionality and quality, posing significant challenges for developing cohesive data processing pipelines. One application of this work is autonomous mineral and rock classification for lunar and Martian rovers. Such rovers are equipped with scientific instruments including heterogeneous spectrometers and imagers, and typically transmit raw data to an orbiting satellite, which then relays it to scientists on Earth for analysis and further instructions. This process incurs severe communication latency and significant resource overhead. Enabling greater autonomy onboard the rover would allow for faster and more efficient in-situ decision making, data processing, and science.
To address this, I developed a novel training procedure for multimodal neural networks that explicitly learns both unimodal and cross-modal features. Each sensor modality is assigned a unique stem of a larger neural network, and these stems are first trained independently using supervised learning. In a second phase, the stems are merged and jointly optimized to learn complementary, multimodal representations. The resulting architecture was used to classify rocks and minerals typically encountered on Mars and the Moon, yielding accuracy rates up to 100% for certain mineral types [5]. Beyond its immediate application to planetary science, this work demonstrates a generalizable method for improving feature learning in multimodal deep learning systems, with implications for future autonomous space exploration missions.
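A compact sketch of the two-phase recipe, with assumed modality names, dimensions, and a simple concatenation fusion (the dual-band Raman model in [5] is larger):

```python
import torch
import torch.nn as nn

# Phase 1: one stem per modality is trained against its own head to learn
# unimodal features. Phase 2: the stems are merged into a joint trunk and
# optimized together to learn complementary, cross-modal features.

stems = nn.ModuleDict({
    "raman_low": nn.Sequential(nn.Linear(256, 64), nn.ReLU()),
    "raman_high": nn.Sequential(nn.Linear(256, 64), nn.ReLU()),
})
trunk = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

def phase1_step(stem, head, x, y, loss_fn):
    # Each stem learns unimodal features against its own classifier head.
    return loss_fn(head(stem(x)), y)

def phase2_forward(batch):
    # Merged model: concatenate unimodal features, classify jointly.
    feats = torch.cat([stems[m](batch[m]) for m in stems], dim=-1)
    return trunk(feats)

head = nn.Linear(64, 10)
loss = phase1_step(stems["raman_low"], head, torch.randn(2, 256),
                   torch.randint(0, 10, (2,)), nn.CrossEntropyLoss())
batch = {m: torch.randn(2, 256) for m in stems}
logits = phase2_forward(batch)  # shape (2, 10)
```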
A related challenge in real-world data is mitigating inherent noise and missing values. While multimodal sensing can partially alleviate this issue by providing redundant sources of information, I showed that performance can be improved further through data repair. Specifically, I employed a Denoising Autoencoder (DAE) that leverages the covariance structure among heterogeneous sensors to reconstruct missing values and reduce noise. I applied this approach to the problem of predicting Evapotranspiration (ET) from real-time satellite imagery, meteorological forcing data, weather towers, and field sensors. Existing approaches typically rely on interpolation methods followed by rigid physics-based calculations. In contrast, I developed a flexible DNN framework capable of operating directly on noisy, incomplete data by integrating a DAE into the pipeline. The DAE reduced artificially induced corruption by 47–94% [6], and when its learned latent representations were used as input to a downstream model, ET prediction improved from an $R^2$ value of 0.69 to 0.72. These improvements have important implications for agricultural monitoring and sustainable water-resource management.
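A minimal sketch of the repair mechanism, with an assumed sensor count and masking rate:

```python
import torch
import torch.nn as nn

class SensorDAE(nn.Module):
    """Sketch of the repair idea: corrupt inputs by masking random sensor
    channels, then train the autoencoder to reconstruct the clean vector,
    so it learns the cross-sensor covariance needed to fill gaps."""

    def __init__(self, n_sensors=24, hidden=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_sensors, hidden), nn.ReLU())
        self.dec = nn.Linear(hidden, n_sensors)

    def forward(self, x):
        return self.dec(self.enc(x))

dae = SensorDAE()
clean = torch.randn(32, 24)
mask = (torch.rand_like(clean) > 0.3).float()  # ~30% missing values
corrupted = clean * mask
loss = nn.functional.mse_loss(dae(corrupted), clean)
loss.backward()  # latents dae.enc(corrupted) can also feed the ET model
```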
I further advanced this ET prediction framework by integrating Monte Carlo (MC) sampling methods into the inference pipeline for sensitivity analysis and uncertainty quantification. Classical DNNs produce a single deterministic output, which obscures information about predictive uncertainty. To address this, I introduced random perturbations to the input data to evaluate sensitivity to predictors, and applied random dropout during inference to estimate uncertainty related to model parameters. This produced a distribution of predictions, allowing for an assessment of confidence, variance, and dominant modes. Such information is especially valuable in autonomous surveying systems, where regions of high uncertainty or sensitivity can be targeted for further data collection or more careful analysis.
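A sketch of both MC probes in a single inference routine; the model, sample count, and perturbation scale are illustrative:

```python
import torch

def mc_predict(model, x, n=100, noise_std=0.05):
    """Sketch of the two MC probes: input perturbation for sensitivity to
    predictors, and dropout kept active at inference for uncertainty in
    the model parameters. `model` is assumed to contain dropout layers."""
    model.train()  # keep dropout active during inference (MC dropout)
    with torch.no_grad():
        preds = torch.stack([
            model(x + noise_std * torch.randn_like(x)) for _ in range(n)
        ])
    return preds.mean(0), preds.std(0)  # point estimate and spread

model = torch.nn.Sequential(
    torch.nn.Linear(10, 32), torch.nn.ReLU(),
    torch.nn.Dropout(0.2), torch.nn.Linear(32, 1),
)
mean, std = mc_predict(model, torch.randn(4, 10))
```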
Future Work
My current projects, while finishing my thesis, are: (1) developing Vision-Language-Action (VLA) pipelines for autonomous drones and rovers, and augmenting them with my dynamic and adaptive frameworks; and (2) developing robust sim-to-real applications that train autonomous drone and rover policies in a simulator and deploy them to real-world vehicles.
References
[1] Yang, Mengting, Timothy K. Johnsen, et al. “SmartDepth: Motion-Aware Depth Prediction with Intelligent Computing for Navigation.” 21st International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT). IEEE, 2025.
[2] Johnsen, Timothy K., Ian Harshbarger, and Marco Levorato. “An Overview of Adaptive Dynamic Deep Neural Networks via Slimmable and Gated Architectures.” 2024 15th International Conference on Information and Communication Technology Convergence (ICTC). IEEE, 2024.
[3] Johnsen, Timothy K., and Marco Levorato. “NaviSlim: Adaptive Context-Aware Navigation and Sensing via Dynamic Slimmable Networks.” 9th International Conference on Internet-of-Things Design and Implementation (IoTDI). IEEE, 2024.
[4] Johnsen, Timothy K., et al. “NaviSplit: Dynamic Multi-Branch Split DNNs for Efficient Distributed Autonomous Navigation.” 25th International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM). IEEE, 2024.
[5] Johnsen, Timothy K., and Virginia C. Gulick. “Single- and multi-mineral classification using dual-band Raman spectroscopy for planetary surface missions.” American Mineralogist 110.5 (2025): 685-698.
[6] Johnsen, Timothy K., Xiangyu Bi, et al. “Denoising autoencoder for reconstructing sensor observation data and predicting evapotranspiration: Noisy and missing values repair and uncertainty quantification.” Water Resources Research 61.10 (2025): e2024WR039831.