Physical Sciences and Mathematics | Open Access Articles

External-Memory Dictionaries With Worst-Case Update Cost, Rathish Das, John Iacono, Yakov Nekrich Dec 2022

Michigan Tech Publications

The $B^\epsilon$-tree [Brodal and Fagerberg 2003] is a simple I/O-efficient external-memory-model data structure that supports updates orders of magnitude faster than the B-tree with a query performance comparable to the B-tree: for any positive constant $\epsilon < 1$, insertions and deletions take $O(\frac{1}{B^{1-\epsilon}} \log_B N)$ time (rather than $O(\log_B N)$ time for the classic B-tree), queries take $O(\log_B N)$ time, and range queries returning $k$ items take $O(\log_B N + \frac{k}{B})$ time. Although the $B^\epsilon$-tree has an optimal update/query tradeoff, the runtimes are amortized. Another structure, the write-optimized skip list, introduced by Bender et al. [PODS 2017], has the same performance as the $B^\epsilon$-tree but with runtimes that are randomized rather than amortized. In this paper, we present a variant of the $B^\epsilon$-tree with deterministic worst-case running times that are identical to the original’s amortized running times.
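To give a feel for the gap between the bounds quoted in this abstract, the following minimal sketch evaluates them numerically with all constant factors dropped. The function names and the sample values of N, B, and ϵ are illustrative assumptions, not taken from the paper, and the numbers are asymptotic expressions rather than measured I/O counts.

```python
import math


def btree_update_ios(n: int, b: int) -> float:
    """B-tree update bound with constants dropped: log_B N."""
    return math.log(n, b)


def beps_update_ios(n: int, b: int, eps: float) -> float:
    """B^eps-tree update bound with constants dropped: (log_B N) / B^(1 - eps)."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie strictly between 0 and 1")
    return math.log(n, b) / b ** (1 - eps)


def range_query_ios(n: int, b: int, k: int) -> float:
    """Range-query bound with constants dropped: log_B N + k / B."""
    return math.log(n, b) + k / b


if __name__ == "__main__":
    # Hypothetical parameters: 2^30 items, blocks of 1024 items, eps = 1/2.
    n, b, eps = 2**30, 1024, 0.5
    print("B-tree update bound:    ", btree_update_ios(n, b))
    print("B^eps-tree update bound:", beps_update_ios(n, b, eps))
    print("Range query, k = 4096:  ", range_query_ios(n, b, 4096))
```

For these sample values the update bound shrinks by a factor of $B^{1-\epsilon} = 32$ relative to the B-tree bound, while the point-query bound is unchanged.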

Improving Protein Succinylation Sites Prediction Using Embeddings From Protein Language Model, Suresh Pokharel, Pawel Pratyush, Michael Heinzinger, Robert H. Newman, Dukka Kc Oct 2022

Michigan Tech Publications

Protein succinylation is an important post-translational modification (PTM) responsible for many vital metabolic activities in cells, including cellular respiration, regulation, and repair. Here, we present a novel approach that combines features from supervised word embedding with embedding from a protein language model called ProtT5-XL-UniRef50 (hereafter termed, ProtT5) in a deep learning framework to predict protein succinylation sites. To our knowledge, this is one of the first attempts to employ embedding from a pre-trained protein language model to predict protein succinylation sites. The proposed model, dubbed LMSuccSite, achieves state-of-the-art results compared to existing methods, with performance scores of 0.36, 0.79, 0.79 …

Orthogonal Point Location And Rectangle Stabbing Queries In 3-D, Timothy M. Chan, Yakov Nekrich, Saladi Rahul, Konstantinos Tsakalidis Sep 2022

Michigan Tech Publications

In this work, we present a collection of new results on two fundamental problems in geometric data structures: orthogonal point location and rectangle stabbing.

• Orthogonal point location. We give the first linear-space data structure that supports 3-d point location queries on $n$ disjoint axis-aligned boxes with optimal $O(\log n)$ query time in the (arithmetic) pointer machine model. This improves the previous $O(\log^{3/2} n)$ bound of Rahul [SODA 2015]. We similarly obtain the first linear-space data structure in the I/O model with optimal query cost, and also the first linear-space data structure in the word RAM model with sub-logarithmic …

Integrating Deep Learning And Hydrodynamic Modeling To Improve The Great Lakes Forecast, Pengfei Xue, Aditya Wagh, Gangfeng Ma, Yilin Wang, Yongchao Yang, Tao Liu, Chenfu Huang May 2022

Michigan Tech Publications

The Laurentian Great Lakes, one of the world’s largest surface freshwater systems, pose a modeling challenge in seasonal forecast and climate projection. While physics-based hydrodynamic modeling is a fundamental approach, improving the forecast accuracy remains critical. In recent years, machine learning (ML) has quickly emerged in geoscience applications, but its application to the Great Lakes hydrodynamic prediction is still in its early stages. This work is the first one to explore a deep learning approach to predicting spatiotemporal distributions of the lake surface temperature (LST) in the Great Lakes. Our study shows that the Long Short-Term Memory (LSTM) neural network, …

A Few-Shot Learning Model Based On A Triplet Network For The Prediction Of Energy Coincident Peak Days, Jinxiang Liu, Laura Brown May 2022

Michigan Tech Publications

In an electricity system, a coincident peak (CP) is defined as the highest daily power demand in a year, which plays an important role in keeping the balance between power supply and demand. Advance information about the time of coincident peaks would be helpful for both utility companies and their customers. This work addresses the prediction of the five coincident peak days (5CP) in a year. We present a few-shot learning model to classify a day as a 5CP day or a non-5CP day 24 hours ahead. A triplet network is implemented for the 2-way-5-shot classifications on six different historical …
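As a rough illustration of the ideas named in this abstract, the sketch below shows the standard triplet margin loss and a simple 2-way-5-shot decision rule over precomputed embeddings. It is not the authors' model: the embedding network is omitted, and the function names, the margin value, and the mean-distance decision rule are all assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def euclidean(u: Vector, v: Vector) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector,
                 margin: float = 1.0) -> float:
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the anchor is closer to the positive than to the
    negative by at least `margin` (an assumed hyperparameter).
    """
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def classify(query: Vector, support: Dict[str, List[Vector]]) -> str:
    """Assign the label whose support embeddings are closest on average.

    In a 2-way-5-shot episode, `support` holds two labels with five
    embeddings each. The mean-distance rule is an assumption for this sketch.
    """
    def mean_distance(label: str) -> float:
        examples = support[label]
        return sum(euclidean(query, e) for e in examples) / len(examples)

    return min(support, key=mean_distance)


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for the output of a trained network.
    support = {
        "5CP": [[1.0, 1.0], [1.1, 0.9], [0.9, 1.2], [1.2, 1.1], [1.0, 0.8]],
        "non-5CP": [[0.0, 0.0], [0.1, -0.1], [-0.2, 0.1], [0.0, 0.2], [0.1, 0.1]],
    }
    print(classify([0.95, 1.05], support))
    print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```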

Einstein-Roscoe Regression For The Slag Viscosity Prediction Problem In Steelmaking, Hiroto Saigo, Dukka Kc, Noritaka Saito Apr 2022

Michigan Tech Publications

In classical machine learning, regressors are trained without attempting to gain insight into the mechanism connecting inputs and outputs. The natural sciences, however, are interested in finding a robust, interpretable function for the target phenomenon that can return predictions even outside of the training domains. This paper focuses on the viscosity prediction problem in steelmaking and proposes Einstein-Roscoe regression (ERR), which learns the coefficients of the Einstein-Roscoe equation and is able to extrapolate to unseen domains. Besides, it is often the case in the natural sciences that some measurements are unavailable or more expensive than others due to physical constraints. To this …
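For orientation, the Einstein-Roscoe equation is commonly written as $\eta = \eta_0 (1 - a\phi)^{-n}$, where $\eta_0$ is the viscosity of the liquid phase and $\phi$ is the solid volume fraction. The sketch below evaluates that form and recovers the exponent $n$ by a least-squares fit in log space with $a$ held fixed. This is only an illustration of what "learning the coefficients" can mean: the fitting procedure of the paper is not described in this excerpt, and the function names, the default values $a = 1.35$ and $n = 2.5$ (often quoted for uniform spheres), and the synthetic data are assumptions.

```python
import math
from typing import Sequence


def einstein_roscoe(eta0: float, phi: float,
                    a: float = 1.35, n: float = 2.5) -> float:
    """Viscosity of a suspension: eta0 * (1 - a * phi) ** (-n)."""
    if not 0 <= a * phi < 1:
        raise ValueError("a * phi must lie in [0, 1)")
    return eta0 * (1 - a * phi) ** (-n)


def fit_exponent(phis: Sequence[float], etas: Sequence[float],
                 eta0: float, a: float = 1.35) -> float:
    """Least-squares estimate of n with a fixed.

    Taking logs gives ln(eta / eta0) = -n * ln(1 - a * phi), a line through
    the origin, so n = -sum(x * y) / sum(x * x) with x = ln(1 - a * phi)
    and y = ln(eta / eta0).
    """
    xs = [math.log(1 - a * p) for p in phis]
    ys = [math.log(e / eta0) for e in etas]
    denom = sum(x * x for x in xs)
    if denom == 0:
        raise ValueError("need at least one sample with phi > 0")
    return -sum(x * y for x, y in zip(xs, ys)) / denom


if __name__ == "__main__":
    # Synthetic, noise-free data generated from the equation itself.
    phis = [0.05, 0.10, 0.20, 0.30]
    etas = [einstein_roscoe(0.5, p) for p in phis]
    print(fit_exponent(phis, etas, eta0=0.5))
```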
