Computer Sciences | Open Access Articles | Digital Commons Network™

Interpretable Timbre Synthesis Using Variational Autoencoders Regularized On Timbre Descriptors, Anastasia Natsiou, Luca Longo, Sean O'Leary Jul 2023

Conference papers

Controllable timbre synthesis has been a subject of research for several decades, and deep neural networks have been the most successful in this area. Deep generative models such as Variational Autoencoders (VAEs) have the ability to generate a high-level representation of audio while providing a structured latent space. Despite their advantages, the interpretability of these latent spaces in terms of human perception is often limited. To address this limitation and enhance the control over timbre generation, we propose a regularized VAE-based latent space that incorporates timbre descriptors. Moreover, we suggest a more concise representation of sound by utilizing its harmonic …

The Significance Of Sonic Branding To Strategically Stimulate Consumer Behavior: Content Analysis Of Four Interviews From Jeanna Isham’s “Sound In Marketing” Podcast, Ina Beilina May 2022

Student Theses and Dissertations

Purpose:
Sonic branding is not just about composing jingles like McDonald’s “I’m Lovin’ It.” Sonic branding is an industry that strategically designs a cohesive auditory component of a brand’s corporate identity. This paper examines the psychological impact of music and sound on consumer behavior reviewing studies from the past 40 years and investigates the significance of stimulating auditory perception by infusing sound in consumer experience in the modern 2020s.

Design/methodology/approach:
Qualitative content analysis of audio media was used to test two hypotheses. Four archival oral interview recordings from Jeanna Isham’s podcast “Sound in Marketing” featuring the sonic branding experts …

Real Time Call-Flagging System To Respond To Suicidal Ideation In Call Centers, Vishnu Menon, Joseph Carrigan, Charles Floeder, Thomas Walton, Devin Mcguire May 2022

Honors Theses

The 2021-2022 Signature Performance Design Studio team developed a live audio call-flagging system that enables faster responses and new response pathways to veteran crises by call service representatives and their management team. Using a custom-made deep learning model, a live audio streaming server, and a Teams broadcasting add-on, the system empowers Signature Performance call service representatives to make quicker and better-informed decisions to provide veterans the best care possible.

A New Way To Make Music: Processing Digital Audio In Virtual Reality, Gavin E. Payne Jan 2022

Senior Projects Spring 2022

The work of this project attempts to provide new methods of creating music with technology. The product, Fields, is a functional piece of virtual reality software, providing users an immersive and interactive set of tools used to build and design instruments in a modular manner. Each virtual tool is analogous to musical hardware such as guitar pedals, synthesizers, or samplers, and can be thought of as an effect or instrument on its own. Specific configurations of these virtual audio effects can then be played to produce music, and then even saved by the user to load up and play with …

Pranayama Breathing Detection With Deep Learning, Bikash Shrestha Dec 2021

Theses

According to a 2017 National Health Interview Survey by the Centers for Disease Control and Prevention (CDC), yoga, a complementary health approach, is chosen by around 14.3% of adults in the US. Kapalbhati pranayama, a yoga practice of alternating fast exhales and longer passive inhales, is understood to improve health. Incorrect and irregular practice, however, can cause injuries and adverse effects. To avoid these undesired effects, it is essential to maintain a pace fit for the practitioner. In the absence of any tools to observe the pace of practice, this work develops a deep learning method that listens to …

Detecting Symptoms Of Chronic Obstructive Pulmonary Disease And Congestive Heart Failure Via Cough And Wheezing Sounds Using Smart-Phones And Machine Learning, Anthony Windmon Sep 2020

USF Tampa Graduate Theses and Dissertations

Chronic Obstructive Pulmonary Disease (COPD) and Congestive Heart Failure (CHF) are progressive disorders and major health concerns among today’s aging population. COPD causes a large mucus buildup in the lungs, leading to chronic cough and difficulty breathing. CHF causes fluid buildup in the lower lungs due to the failing heart, causing cough and difficulty breathing. People who are clinically diagnosed with COPD or CHF are expected to regularly monitor their symptoms and follow complex medical recommendations in an effort to prevent exacerbation. In this dissertation, we elaborate upon three different machine learning based techniques that we developed for …

Spatial Cues Gradient In Time Domain Based Audio Attention Computational Model, Hang Bo, Wang Yi, Changqing Kang, Huang Jian Aug 2020

Journal of System Simulation

Abstract: In virtual reality audio, sound sources whose directions change rapidly should receive a higher attention level. However, present bottom-up audio attention computational models extract the underlying characteristics of single-channel audio, such as energy, pitch, and zero-crossing rate, which cannot effectively express the audio attention caused by such signals. To solve this problem, based on the psychological principle that spatial information affects attention, a model was proposed that introduces short-term spatial gradient cues to measure the attention caused by changes in the spatial direction of a single audio source. Compared to the traditional audio attention computational model, the recall of …
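
For readers unfamiliar with the single-channel characteristics named in the abstract above, the following is a minimal sketch of two of them, short-time energy and zero-crossing rate, computed per frame. It illustrates the baseline features only, not the proposed spatial-gradient model; the function name `frame_features` and the non-overlapping framing are assumptions made for illustration.

```python
def frame_features(signal, frame_len=4):
    """Return (energy, zero_crossing_rate) for each non-overlapping frame.

    Hypothetical helper: framing and normalisation choices are assumptions,
    not taken from the paper.
    """
    features = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        # Short-time energy: mean of squared samples in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Count sign changes between consecutive samples.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        # Normalise by the number of adjacent sample pairs.
        zcr = crossings / (frame_len - 1)
        features.append((energy, zcr))
    return features


if __name__ == "__main__":
    # An alternating frame followed by a constant frame.
    print(frame_features([1, -1, 1, -1, 1, 1, 1, 1], frame_len=4))
```

Both features depend only on the waveform of one channel, which is why, as the abstract argues, they carry no information about a source's changing direction.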

Geometry Aided Sonification, Michael Tecce Jul 2020

Mathematics and Computer Science Presentations

Sonification is the process of deriving an audio representation of a time series which conveys important information about that time series. Otology and vision science have established that humans process audio information more quickly than visual information, and sonification can convey data to the visually impaired. In our work, we implement pipelines using Python/NumPy, and we handle both ordinary 1D time series and multivariate time series. For 1D time series, we find that using data to modulate the pitch or timing of preselected sounds (such as sine waves) simply and effectively captures repeating patterns and anomalies/outliers within the data. To …
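
The pitch-modulation idea for 1D time series described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's pipeline: it uses only the Python standard library rather than NumPy, and the function names (`value_to_freq`, `sonify_pitch`), the linear value-to-frequency mapping, and the 220-880 Hz range are all hypothetical choices.

```python
import math


def value_to_freq(value, lo, hi, low_hz=220.0, high_hz=880.0):
    """Linearly map a data value in [lo, hi] to a frequency in Hz (assumed mapping)."""
    span = (hi - lo) or 1.0  # a constant series maps to low_hz
    return low_hz + (value - lo) / span * (high_hz - low_hz)


def sonify_pitch(series, sample_rate=8000, note_seconds=0.1):
    """Render each data point as a short sine tone whose pitch tracks the value."""
    lo, hi = min(series), max(series)
    samples_per_note = round(sample_rate * note_seconds)
    samples = []
    phase = 0.0
    for value in series:
        freq = value_to_freq(value, lo, hi)
        step = 2.0 * math.pi * freq / sample_rate
        for _ in range(samples_per_note):
            samples.append(math.sin(phase))
            # Carrying the phase across notes avoids clicks at note boundaries.
            phase += step
    return samples


if __name__ == "__main__":
    # The outlier (5) is rendered as a noticeably higher tone than its neighbours.
    audio = sonify_pitch([0, 1, 0, 5, 0, 1, 0])
    print(len(audio), "samples")
```

The returned samples lie in the range -1 to 1 and could be scaled to 16-bit integers and written out with the standard `wave` module for listening.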

Geometry Aided Sonification, Michael Tecce Jul 2020

Computer Science Summer Fellows

Usability Of Sound-Driven User Interfaces, Zachary T. Roth, Dale R. Thompson May 2018

Computer Science and Computer Engineering Undergraduate Honors Theses

The model for interacting with computing devices remains primarily focused on visual design. However, sound has a unique set of advantages. In this work, an experiment was devised where participants were tasked with identifying elements in an audio-only computing environment. The interaction relied on mouse movement and button presses for navigation. Experiment trials consisted of variations in sound duration, volume, and distinctness according to both experiment progress and user behavior. Participant interactions with the system were tracked to examine the usability of the interface. Preliminary results indicated the majority of participants mastered every provided test, but the total time spent …

Type III Compensated Voltage Mode Line Feedforward Synchronously Rectified Boost Converter For Driving Class D Audio H-Bridge To Deliver 7 W Peak Power Into An 8 Ω Speaker, Yavuz Kiliç Jan 2015

Turkish Journal of Electrical Engineering and Computer Sciences

In this paper, a circuit topology for a synchronously rectified boost converter aimed to act as a power supply for a Class D H-bridge that would deliver up to 7 W peak power into an 8 Ω speaker is presented. The design is implemented in a submicron BCD process technology. Some of the challenges in this design are as follows. The first challenge was to optimise the design such that it would not cause on-chip power dissipation levels that would go beyond acceptable levels for Wafer Level Chip Scale Package (WLCSP), which is a commonly used package for space constrained …

Data Collection For The Similar Segments In Social Speech Task, Nigel G. Ward, Steven D. Werner Sep 2013

Departmental Technical Reports (CS)

Information retrieval systems rely heavily on models of similarity, but for spoken dialog such models currently use mostly standard textual-content similarity. As part of the MediaEval Benchmarking Initiative, we have created a new corpus to support development of similarity models for spoken dialog. This corpus includes 26 casual dialogs among members of two semi-cohesive groups, totaling about 5 hours, with 1889 labeled regions associated into 227 sets which annotators judged to be similar enough to share a tag. This technical report brings together information about this corpus and its intended uses.

Digital Audio In The Library, Richard Griscom Sep 2009

Richard Griscom

An incomplete draft of a book intended to serve as a guide and reference for librarians who are responsible for implementing digital audio services in their libraries. The book is divided into two parts. Part 1, "Digital Audio Technology," covers the fundamentals of recorded sound and digital audio, including a description of digital audio formats, how digital audio is delivered to the listener, and how digital audio is created. Part 2, "Digital Audio in the Library," covers digitizing local collections, providing streaming audio reserves, and using digital audio to preserve analog recordings.

Problems Of Music Information Retrieval In The Real World, Donald Byrd Jan 2002

Computer Science Department Faculty Publication Series

Although a substantial number of research projects have addressed music information retrieval over the past three decades, the field is still very immature. Few of these projects involve complex (polyphonic) music; methods for evaluation are at a very primitive stage of development; none of the projects tackles the problem of realistically large-scale databases. Many problems to be faced are due to the nature of music itself. Among these are issues in human perception and cognition of music, especially as they concern the recognizability of a musical phrase. This paper considers some of the most fundamental problems in music information retrieval, …
