Skip to main content


Go Search
Department of Computer Science & Engineering
About the Department
Graduate Programs
Undergraduate Programs
Student Services
Wustl Engineering
Department of Computer Science & Engineering > Research > Technical Reports  

Technical Reports

Modify settings and columns
Capacity Augmentation Bound of Federated Scheduling for Parallel DAG TasksUse SHIFT+ENTER to open the menu (new window).
Jing Li, Abusayeed Saifullah,
Kunal Agrawal, Christopher Gill, and Chenyang Lu
9/13/2013 Reports/Attachments/1045/FS[1].pdf
We present a novel federated scheduling approach for parallel real-time tasks under a general directed acyclic graph (DAG) model. We provide a capacity augmentation bound of 2 for hard real-time scheduling; here we use the worst-case execution time and critical-path length of tasks to determine schedulability.  This is the best known capacity augmentation bound for parallel tasks. By constructing example task sets, we further show that the lower bound on capacity augmentation of federated scheduling is also 2 for any m > 2. Hence, the gap is closed and bound 2 is a strict bound for federated scheduling. The federated scheduling algorithm is also a schedulability test that often admits task sets with utilization much greater than 50%m.
CloudPowerCap: Integrating Power Budget and Resource Management across a Virtualized Server ClusterUse SHIFT+ENTER to open the menu (new window).
Yong Fu, Anne Holler, Chenyang Lu
3/21/2014 Reports/Attachments/1043/icac14-tech-report1.pdf
In many datacenters, server racks are highly underutilized.
Rack slots are left empty to keep the sum of the server
nameplate maximum power below the power provisioned
to the rack. And the servers that are placed in the rack
cannot make full use of available rack power. The root
cause of this rack underutilization is that the server nameplate power is often much higher than can be reached
in practice. To address rack underutilization, server vendors are shipping support for per-host power caps, which
provide a server-enforced limit on the amount of power
that the server can draw. Using this feature, datacenter
operators can set power caps on the hosts in the rack to
ensure that the sum of those caps does not exceed the
rack’s provisioned power. While this approach improves
rack utilization, it burdens the operator with managing
the rack power budget across the hosts and does not lend
itself to flexible allocation of power to handle workload
usage spikes or to respond to changes in the amount of
powered-on server capacity in the rack. In this paper we
present CloudPowerCap, a practical and scalable solution
for power budget management in a virtualized environment. CloudPowerCap manages the power budget for a
cluster of virtualized servers, dynamically adjusting the
per-host power caps for hosts in the cluster. Integrated
with VMware Distributed Resource Scheduler, CloudPowerCap can provide better use of power than per-host
static settings, while respecting virtual machine resource
entitlements and constraints
Real-Time Multi-Core Virtual Machine Scheduling in XenUse SHIFT+ENTER to open the menu (new window).
Sisu Xi, Meng Xu, Chenyang Lu, Linh T.X. Phan, Christopher Gill, Oleg Sokolsky, Insup Lee
11/26/2013 Reports/Attachments/1032/rtxenmc.pdf
Recent years have witnessed two major trends in
the development of complex real-time systems. First, to reduce
cost and enhance flexibility, multiple systems are sharing common
computing platforms via virtualization technology, instead of
being deployed separately on physically isolated hosts. Second,
multicore processors are increasingly being used in real-time
systems. The integration of real-time systems as virtual machines
(VMs) atop common multicore platforms raises significant new
research challenges in meeting the real-time performance requirements of multiple systems.
Real-Time Multi-Core Virtual Machine Scheduling in XenUse SHIFT+ENTER to open the menu (new window).
Sisu Xi, Meng Xu, Chenyang Lu, Linh T.X. Phan, Christopher Gill, Oleg Sokolsky, Insup Lee
Recent years have witnessed two major trends in
the development of complex real-time systems. First, to reduce cost and enhance flexibility, multiple systems are sharing common computing platforms via virtualization technology, instead of being deployed separately on physically isolated hosts. Second, multicore processors are increasingly being used in real-time systems. The integration of real-time systems as virtual machines (VMs) atop common multicore platforms raises significant new research challenges in meeting the real-time performance requirements of multiple systems.
Difficulties and Opportunities in Building Resilient Clinical Monitoring Systems with Wireless Sensor NetworksUse SHIFT+ENTER to open the menu (new window).
MS Thesis
Rahav Dor
9/9/2013 Reports/Attachments/1028/RahavMS_Thesis.pdf
Wireless Sensor Networks (WSN) can play an important role in improving patient care by collecting continuous vital signs and using it, in real-time, for clinical decision support. This Masters Thesis presents the architecture of, and our experiences with such a system. Our system encompasses portable wireless pulse oximeters, a wireless relay network spanning multiple hospital floors, and integration into the hospital Clinical Data Repository (CDR) database. We report our experience and lessons learned from a 14-month clinical trial of the system in six hospital wards of Barnes-Jewish Hospital in St. Louis, Missouri.

Such a deployment of a computer system around people daily lives makes it a Cyber Physical System (CPS) and exposes numerous challenges, where the advantages of WSN (for example being very low power) are in conflict with the unforgiving environment (for example overcoming interference in a heavily populated hospital, despite a low power transmission). As this Masters Thesis will show such a CPS can be made reliable, but there are numerous integration points in which the system can (and did) fail. Our experiences show the feasibility of achieving reliable vital sign collection, using a wireless sensor network integrated with hospital IT infrastructure and procedures. We highlight technical and non-technical elements that pose challenges in a real-world hospital environment and provide guidelines for successful and efficient deployment of similar systems. At the end of this paper we hope that we will have given enough evidence and recommendations to convince the reader that the question “Can CPS that uses a WSN enjoy the advantages of very low power wireless communication, and be made reliable in a hospital environment?” is no longer open.
A thesis presented to the School of Engineering of Washington University in partial fulfillment of the requirements for the degree of
Master of Science (Masters in Computer Engineering).
Self-Adapting MAC Layer for Wireless Sensor NetworksUse SHIFT+ENTER to open the menu (new window).
Mo Sha, Rahav Dor, Gregory Hackmann, Chenyang Lu,  Tae-Suk Kim, Taerim Park
9/6/2013 Reports/Attachments/1027/saml.pdf
The integration of wireless sensors with mobile phones is gaining momentum as an enabling platform for numerous emerging applications. These mobile systems face dynamic environments where both application requirements and ambient wireless conditions change frequently. Despite the existence of many MAC protocols however, none can provide optimal performance along multiple dimensions, in particular when the conditions are frequently changing. Instead of pursuing a one-MAC-fit all approach we present a Self-Adapting MAC Layer (SAML) comprising (1) a Reconfigurable MAC Architecture (RMA) that can switch to different MAC protocols at run time and (2) a learning-based MAC Selection Engine that selects the protocol most suitable for the current condition and requirements. As the ambient conditions or application requirements change SAML dynamically switches MAC protocols to gain the desired performance. To the application SAML appears as a traditional MAC protocol and its benefits are realized without troubling the application with the underlying complexity. To test the system we implement SAML in TinyOS 2.x and realize three prototypes containing up to five MACs. We evaluate the system in controlled tests and real-world environments using a new gateway device that integrates a 802.15.4 radio with Android phones. Our experimental results show that SAML provides an efficient and reliable MAC switching, while adheres to the application specified requirements.
Difficulties and Opportunities in Building Resilient Clinical Monitoring Systems wtih Wireless Sensor NetworksUse SHIFT+ENTER to open the menu (new window).
MS Thesis
Rahav Dor
Wireless Sensor Networks (WSN) can play an important role in improving patient care by collecting continuous vital signs and using it, in real-time, for clinical decision support. This Masters Thesis presents the architecture of, and our experiences with such a system. Our system encompasses portable wireless pulse oximeters, a wireless relay network spanning multiple hospital floors, and integration into the hospital Clinical Data Repository (CDR) database. We report our experience and lessons learned from a 14-month clinical trial of the system in six hospital wards of Barnes-Jewish Hospital in St. Louis, Missouri.

Such a deployment of a computer system around people daily lives makes it a Cyber Physical System (CPS) and exposes numerous challenges, where the advantages of WSN (for example being very low power) are in conflict with the unforgiving environment (for example overcoming interference in a heavily populated hospital, despite a low power transmission). As this Masters Thesis will show such a CPS can be made reliable, but there are numerous integration points in which the system can (and did) fail. Our experiences show the feasibility of achieving reliable vital sign collection, using a wireless sensor network integrated with hospital IT infrastructure and procedures. We highlight technical and non-technical elements that pose challenges in a real-world hospital environment and provide guidelines for successful and efficient deployment of similar systems. At the end of this paper we hope that we will have given enough evidence and recommendations to convince the reader that the question “Can CPS that uses a WSN enjoy the advantages of very low power wireless communication, and be made reliable in a hospital environment?” is no longer open.
Computational Aspects of Approval Voting and Declared-Strategy VotingUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Robert LeGrand III
5/16/2008 Reports/Attachments/1020/dissertation_LeGrand.pdf
Computational social choice is a relatively new discipline that explores issues at the intersection of social choice theory and computer science. Designing a protocol for collective decision-making is made difficult by the possibility of manipulation through insincere voting. In approval voting systems, voters decide whether to approve or disapprove available alternatives; however, the specific nature of rational approval strategies has not been adequately studied. This research explores aspects of strategy under three different approval systems, from chiefly a computational viewpoint.
EWA Model with Recency Effect and Limited MemoryUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Hang Xie
5/21/2013 Reports/Attachments/1017/Xie_project1.pdf
Game theory is an important field in economics; it studies how people make decisions amid conflict and cooperation. Various experiments have been carried to study the way people play those games, and economists study those data for various purposes. There has been a rise of need for using artificial agents to simulate the game, since we could save the cost of hiring human subjects for the experiments, and we could gain more control over the experiment settings.
Kernel Density Metric LearningUse SHIFT+ENTER to open the menu (new window).
Yujie He, Wenlin Chen, Yixin Chen
5/12/2013 Reports/Attachments/1016/kdd-2013.pdf
This paper introduces a supervised metric learning algorithm, called kernel density metric learning (KDML), which is easy to use and provides nonlinear, probability-based distance measures. KDML constructs a direct nonlinear mapping from the original input space into a feature space based on kernel density estimation. The nonlinear mapping in KDML embodies established distance measures between probability density functions, and leads to correct classification on datasets for which linear metric learning methods would fail. Existing metric learning algorithms, such as large margin nearest neighbors (LMNN), can then be applied to the KDML features to learn a Mahalanobis distance. We also propose an integrated optimization algorithm that learns not only the Mahalanobis matrix but also kernel bandwidths, the only hyper-parameters in the nonlinear mapping. KDML can naturally handle not only numerical features, but also categorical ones, which is rarely found in previous metric learning algorithms. Extensive experimental results on various benchmark datasets show that KDML significantly improves existing metric learning algorithms in terms of kNN classification accuracy.
Automated Color Calibration of Display DevicesUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Andrew Shulman
5/10/2013 Reports/Attachments/1015/Shulman_project.pdf
If you compare two identical images on two different monitors, they will likely appear different. Every display device is supposed to adhere to a particular set of standards regulating the color and intensity of the image it outputs. However, in practice, very few do. Color calibration is the practice of modifying the signal path such that the colors produced more closely match reference standards.

This is essential for graphics professionals who are mastering original content. They must ensure that the source material appears correct when viewed on a reference monitor. When viewed on a consumer panel, however, some error will appear on an uncalibrated panel. On low-end devices, calibration helps to maximize picture quality, producing an image comparable to a more expensive panel out of the box. On high-end devices, it can reduce color error to levels imperceptible to the human eye.
Wireless Cyber-Physical Simulator and Case Studies on Structural ControlUse SHIFT+ENTER to open the menu (new window).
MS Thesis
Bo Li
4/26/2013 Reports/Attachments/1014/wcps-thesis-final.pdf
Wireless Structural Control (WSC) systems can play a crucial role in protecting civil infrastructure in the event of earthquakes and other natural disasters. Such systems represent an exemplary class of cyber-physical systems that perform close-loop control using wireless sensor networks. Existing WSC research usually employs wireless sensors installed on small lab structures, which cannot capture realistic delays and data loss in wireless sensor networks deployed on large civil structures. The lack of realistic tools that capture both the cyber (wireless) and physical (structural) aspects of WSC systems has been a hurdle for cyber-physical systems research for civil infrastructure. This advances the state of the art through the following contributions. First, we developed the Wireless Cyber-Physical Simulator (WCPS), an integrated environment that combines realistic simulations of both wireless sensor networks and structures. WCPS integrates Simulink and TOSSIM, a state-of-the-art sensor network simulator featuring a realistic wireless model seeded by signal traces. Second, we performed two realistic case studies each combining a structural model with wireless traces collected from real-world environments. The building study combines a benchmark building model and wireless traces collected from a multi-story building. The bridge study combines the structural model of the Cape Girardeau bridge over the Mississippi River and wireless traces collected from a similar bridge (the Jindo Bridge) in South Korea. These case studies shed light on the challenges of WSC and the limitations of a traditional structural control approach under realistic wireless conditions. Finally, we proposed a cyber-physical co-design approach to WSC that integrates a novel holistic scheduling scheme (for sensing, communication and control) and an Optimal Time Delay Controller (OTDC) that substantially improves structural control performance.
Parallel Real-Time Scheduling of DAGsUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah, David Ferry, Jing Li,   Kunal Agrawal, Chenyang Lu, and   Christopher Gill
4/23/2013 Reports/Attachments/1013/dag_rep.pdf
Recently, multi-core processors have become mainstream in processor   design.  To take full advantage of multi-core processing,   computation-intensive real-time systems must exploit intra-task   parallelism.   In this paper, we address the open problem of  real-time scheduling for  a general model of deterministic parallel tasks, where each task is   represented as a directed acyclic graph (DAG) with  nodes having arbitrary execution requirements.  We prove processor-speed   augmentation bounds for both preemptive and non-preemptive real-time   scheduling for general DAG tasks on multi-core processors.  We first   decompose each DAG into sequential tasks with their own release times and deadlines.  Then we prove that these decomposed tasks can be scheduled using preemptive global EDF with a resource   augmentation bound of 4.  This bound is as good as the best known   bound for more restrictive models, and is the first for a general   DAG model.  We also prove that the decomposition has a resource   augmentation bound of 4 plus a non-preemption overhead for   non-preemptive global EDF scheduling. To our knowledge, this is the   first resource augmentation bound for non-preemptive scheduling of parallel tasks.  Finally, we evaluate our analytical results through   simulations that demonstrate that the derived bounds are safe,  and   reasonably tight in practice, especially under preemptive EDF scheduling.
Learning with Single View Co-training and Marginalized DropoutUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Minmin Chen
4/9/2013 Reports/Attachments/1012/ChenDissertation.pdf
The generalization properties of most existing machine learning techniques are predicated on the assumptions that 1) a sufficiently large quantity of training data is available; 2) the training and testing data come from some common distribution. Although these assumptions are often met in practice, there are also many scenarios in which training data from the relevant distribution is insufficient. We focus on making use of additional data, which is readily available or can be obtained easily but comes from a different distribution than the testing data, to aid learning.
We present five learning scenarios, depending on how the distribution we used to sample the additional training data differs from the testing distribution: 1) learning with weak supervision; 2) domain adaptation; 3) learning from multiple domains; 4) learning from corrupted data; 5) learning with partial supervision.
We introduce two strategies and manifest them in five ways to cope with the difference between the training and testing distribution. The first strategy, which gives rise to Pseudo Multi-view Co-training (PMC) and Co-training for Domain Adaptation (CODA), is inspired by the co-training [23] algorithm for multi-view data. PMC generalizes co-training to the more common single view data and allows us to learn from weakly labeled data retrieved free from the web. CODA integrates PMC with an another feature selection component to address the feature incompatibility between domains for domain adaptation. PMC and CODA are evaluated on a variety of real datasets, and both yield record performance.
The second strategy marginalized dropout leads to marginalized Stacked Denoising Autoencoders (mSDA), Marginalized Corrupted Features (MCF) and FastTag (FastTag). mSDA diminishes the difference between distributions associated with different domains by learning a new representation through marginalized corruption and reconstruciton. MCF learns from a known distribution which is created by corrupting a small set of training data, and improves robustness of learned classifiers by training on "infinitely" many data sampled from the distribution. FastTag applies marginalized dropout to the output of partially labeled data to recover missing labels for multi-label tasks. These three algorithms not only achieve the state-of-art performance in various tasks, but also deliver orders of magnitude speed up at training and testing comparing to competing algorithms.
Network Efficiency in Peer-to-peer Data DistributionUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Shakir James
12/28/2012 Reports/Attachments/1011/JamesDissertation.pdf
Data distribution applications transfer data les on computer networks. On these networks, peer-to-peer applications transfer bulk data quickly to many destinations. But existing applications increase network costs on the Internet and underuse network capacity on the data center network.
To address these network ineciencies, this thesis introduces two applications. First, it proposes an Internet application that decreases the costly trac of peer-to-peer applications while maintaining their download times. This approach is novel because it works without changing existing peer-to-peer applications on the Internet.
Second, after identifying the problem with existing applications in data centers, this thesis proposes a peer-to-peer application that increases its usage of the network capacity and maintains the speed of other applications. This approach is novel because it is more resilient than similar peer-to-peer applications on the data center network.
Efficient Parallel Real-Time Upsampling of Ultrasound VectorsUse SHIFT+ENTER to open the menu (new window).
William D. Richard, Ph.D.
3/7/2013 Reports/Attachments/1010/Efficient Parallel Upsampling of Ultrasound Vectors_Richard.pdf
Upsampling is required prior to the summation step in most receive digital beamforming implementations to produce an accurate summed RF line or vector.  This is true in both annular and linear array systems where receive echos are digitized first and then time delayed in the digital domain to achieve proper signal alignment.  The efficient, parallel, real-time upsampling circuit presented here produces M upsampled values per ADC clock, where M is the desired upsampling factor.  A circuit implementation that upsamples by a factor of M=4 is presented as an example of the more general technique.
Adding Data Parallelism to Streaming Pipelines for Throughput OptimizationUse SHIFT+ENTER to open the menu (new window).
Peng Li, Kunal Agrawal, Jeremy Buhler, Roger D. Chamberlain
2/20/2013 Reports/Attachments/1009/add_data_parallel.pdf
The streaming model is a popular model for writing high-throughput parallel applications. A streaming application is represented by a graph of computation stages that communicate with each other via FIFO channels. In this report, we consider the problem of mapping streaming pipelines — streaming applications where the graph is a linear chain — in order to maximize throughput. In a parallel setting, subsets of stages, called components can be mapped onto different computing resources. The through-put of an application is determined by the throughput of the slowest component. Therefore, if some stage is much slower than others, then it may be useful to replicate the stage’s code and divide its workload among two or more replicas in order to increase throughput. However, pipelines may consist of some replicable and some non-replicable stages. In this paper, we address the problem of mapping these partially replicable streaming pipelines on both homogeneous and heterogeneous platforms so as to maximize throughput.

We consider two types of platforms, homogeneous platforms — where all resources are identical, and heterogeneous platforms — where resources may have different speeds. In both cases, we consider two network topologies — unidirectional chain and clique. We provide polynomial-time algorithms for mapping partially replicable pipelines onto unidirectional chains for both homogeneous and heterogeneous platforms. For homogeneous platforms, the algorithm for unidirectional chains generalizes to clique topologies. However, for heterogeneous platforms, mapping these pipelines onto clique topologies is NP-complete. We provide heuristics to generate solutions for cliques by applying our chain algorithms to a series of chains sampled from the clique. Our empirical results show that these heuristics rapidly converge to near-optimal solutions.
Analysis of Global EDF for Parallel TasksUse SHIFT+ENTER to open the menu (new window).
Jing Li, Kunal Agrawal, Chenyang Lu, Christopher Gill
Department of Computer Science and Engineering
Washington University in St. Louis
St. Louis, MO, USA
2/16/2013 Reports/Attachments/1008/ECRTS_dagedf_li.pdf
As multicore processors become ever more prevalent, it is important for real-time programs to take advantage of intra-task parallelism in order to support computation-intensive applications with tight deadlines. We prove that a Global Earliest Deadline First (GEDF) scheduling policy provides a capacity augmentation bound of 4-2/m and a resource augmentation bound of 2-1/m for parallel tasks in the general directed acyclic graph model. For the proposed capacity augmentation bound of 4-2/m for implicit deadline tasks under GEDF, we prove that if a task set has a total utilization of at most m/(4-2/m) and each task’s critical path length is no more than 1/(4-2/m) of its deadline, it can be scheduled on a machine with m processors under GEDF. Our capacity augmentation bound therefore can be used as a straightforward schedulability test. For the standard resource augmentation bound of 2-1/m for arbitrary deadline tasks under GEDF, we prove that if an ideal optimal scheduler can schedule a task set on m unit-speed processors, then GEDF can schedule the same task set on m processors of speed 2-1/m. However, this bound does not lead to a schedulabilty test since the ideal optimal scheduler is only hypothetical and is not known. Simulations confirm that the GEDF is not only safe under the capacity augmentation bound for various randomly generated task sets, but also performs surprisingly well and usually outperforms an existing scheduling technique that involves task decomposition.
Real-Time Scheduling of Parallel Tasks: Theory and Practice at Washington University in St. Louis
Simple Analytic Performance Models for Streaming Data Applications Deployed on Diverse ArchitecturesUse SHIFT+ENTER to open the menu (new window).
Jonathan C. Beard, Roger D. Chamberlain and Mark A. Franklin
2/11/2013 Reports/Attachments/1007/simplemodelstechreport.pdf
Modern hardware is inherently heterogeneous. With heterogeneity comes multiple abstraction layers that hide underlying complex systems. While hidden, this complexity makes quantitative performance modeling a difficult task. Designers of high-performance streaming applications for heterogeneous systems must contend with unpredictable and often non-generalizable models to predict performance of a particular application and hardware mapping. This paper outlines a computationally simple approach that can be used to model the overall throughput and buffering needs of a streaming application on heterogeneous hardware. The model presented is based upon a hybrid maximum flow and decomposed discrete queueing model. The utility of the model is assessed using a set of real and synthetic benchmarks with model predictions compared to measured application performance.
LCScanner: An Efficient and Accurate Trimming Tool for Illumina Next Generation Sequencing ReadsUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Xiang Zhou
1/8/2013 Reports/Attachments/1006/Project_XiangZhou.pdf
Recent advances in High-Throughput Sequencing (HTS) technology have greatly facilitated the researches in bioinformatics field. With the ultra-high sequencing speed and improved base-calling accuracy, Illumina Genome Analyzer is currently the most widely used platform in the field. To use the raw reads generated from the sequencing machine, the 3’ adapter sequence attached to the real read in the process of ligation needs to be correctly trimmed. This is often done by some inhouse scripts or different packages with various parameters. They either use the Smith-Waterman algorithm or search for an exact match of the 3’ adapter sequence.

In this report, I investigated methodologies as well as the strengths and weaknesses of five representative mainstream adapter trimming tools in order to suggest a direction for other researchers. Furthermore, four sets of detailed analysis were performed to evaluate the performances of these tools. I demonstrated that my adapter trimming method is flexible, accurate and efficient for Next Generation Sequencing (NGS) analysis.
A 3D Selection & Query Tool for the GeneAtlas ProjectUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Donald McCurdy
12/27/2012 Reports/Attachments/1005/McCurdy_project.pdf
In this project, I present an application to view, interact with, and search 3D medical volumes as part of the GeneAtlas project.
Implementation of Real-Time Calibration of Polarization Imaging SensorsUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Collin Foster
12/20/2012 Reports/Attachments/1004/Collin_Project.pdf
Recent breakthroughs in nanofabrication techniques have led to development of sophisticated Division-of-Focal-Plane (DoFP) polarization imaging sensors. One such technique allows the fabrication of nanowire filters fabricated directly on the imaging sensor itself. This technique can be used to fabricate robust DoFP polarization imaging sensors. However, the polarization information captured by the imagers can be degraded due to imperfections in the fabrication of the nanowire filters on the imaging sensor. Polarization information can also be degraded from other sources including crosstalk between pixels. To compensate for these undesired effects, a calibration routine can be applied to each pixel after image capture. In this project, we implement a calibration routine as a step in the pipeline processing of captured polarization images. The polarization image processing and calibration are implemented on an FPGA development board to achieve a real-time response to image captures.
Rapid Development Environments for Therapy Games: Looking Glass Therapy Games for Cerebral Palsy Treatment Utilizing the KinectUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Melynda Eden
12/11/2012 Reports/Attachments/1003/MelyndaEden.pdf
Cerebral palsy is a group of neurological disorders that impair body movement, muscle coordination, hearing, vision, and cognitive function. Symptoms vary but can include muscle weakness, muscle and joint tightness, abnormal or unsteady gait, seizures, learning disabilities, speech problems, as well as hearing or vision problems [1]. Although cerebral palsy cannot be cured, treatments such as physical and occupational therapy can greatly help affected children develop motor skills needed to increase mobility and foster independence [2]. Computer based therapy games have shown promise in helping stroke survivors recover from stroke [3]. Initially, stroke therapy games developed in Looking Glass utilized Nintendo Wii remotes (informally known as Wiimotes) to sense user’s movements. Challenges unfolded with stroke patients who were unable to grasp Wiimotes, thereby limiting and inhibiting game development and the user experience [3]. In this paper, I describe my efforts to integrate the Microsoft Kinect with Looking Glass and build therapy games that utilize the Kinect to track user movements. I detail the Kinect integration and discuss its advantages of seated skeletal tracking with no hand held devices required by the user.
Foveon F13 CameraUse SHIFT+ENTER to open the menu (new window).
MS Project Report
David Shelley
12/7/2012 Reports/Attachments/1002/ShelleyDavid.pdf
Most high-fidelity digital cameras currently available obtain their images using technology where individual pixels can only acquire a single color. Since the acquisition of multiple colors is necessary to capture a full colored image, picture resolution is lost due to difficulties in interpolation between non-adjacent, same color pixels. The resulting unsharp images create the need to find a new way to obtain these images without requiring interpolation between adjacent pixels.

An innovative method to capture images with individual pixel cells consisting of three layers of photodetectors stacked vertically upon one another has been created to rectify this problem, but no complete camera system has been fabricated to date. This project integrates a stacked photo detector imager chip into a complete camera, allowing full functionality and control of the photo detector chip to a personal computer. A six layer printed circuit board is fabricated using Cadence Allegro, firmware is written and compiled using Xilinx ISE and software is written in Microsoft Visual Studio 2010.
Data collection and performance monitoring of real-time parallel systemsUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Mahesh Mahadevan
11/27/2012 Reports/Attachments/1001/Mahadevan_project.pdf
Instrumentation and log data monitoring are important features present in many different applications today that are used in analysis of a system behaviour.  They are largely used in collection of key information from the system which might allow us to describe or establish certain facts about the behavior of the system. Instrumentation in any application usually comes along with its overheads. In some cases , code for instrumentation might become more expensive on the underlying resources than the application itself. For many applications, especially real-time applications , this overhead can cause serious and often catastrophic system failures. Collecting useful information from a real-time application has to be without interfering with the performance of the application. In this paper , I describe a mechanism of collection and visualization of performance data from a real-time concurrency platform in linux , without interfering and ensuring the correctness of running real-time parallel tests and experiments. This paper describes two different forms of data collection 1) Collection of timing related information from various critical phases in a real-time experiment , 2) Collection of cpu preemption information on running experiment threads. Data collected from both these stages are further fed to a visualization tool for post-debugging analysis of real-time experiment.
Limitations and Solutions for Real-Time Local Inter-Domain Communication in XenUse SHIFT+ENTER to open the menu (new window).
Sisu Xi, Chong Li, Chenyang Lu, and Christopher Gill
10/15/2012 Reports/Attachments/1000/rtxenio.pdf
As computer hardware becomes increasingly powerful, there is an ongoing trend towards integrating complex, legacy real-time systems using fewer hosts through virtualization. Especially in embedded systems domains such as avionics and automotive engineering, this kind of system integration can greatly reduce system weight, cost, and power requirements. When systems are integrated in this manner, \emph{network communication} may become local \emph{inter-domain communication} (IDC) within the same host. This paper examines the limitations of inter-domain communication in Xen, a widely used open-source virtual machine monitor (VMM) that recently has been extended to support real-time domain scheduling. We find that both the VMM scheduler and the manager domain can significantly impact real-time IDC performance under different conditions, and show that improving the VMM scheduler alone cannot deliver real-time performance for local IDC. To address those limitations, we present the RTCA, a Real-Time Communication Architecture within the manager domain in Xen, along with empirical evaluations whose results demonstrate that the latency of communication tasks can be improved dramatically from ms to $\mu$s by a combination of the RTCA and a real-time VMM scheduler.
A Memory Access Model for Highly-threaded Many-core ArchitecturesUse SHIFT+ENTER to open the menu (new window).
Lin Ma, Kunal Agrawal, Roger D. Chamberlain
10/1/2012 Reports/Attachments/999/TechnicalReport_ma.pdf
Many-core architectures are excellent in hiding memory-access latency by low-overhead context switching among a large number of threads. The speedup of algorithms carried out on these machines depends on how well the latency is hidden. If the number of threads were infinite, then theoretically these machines should provide the performance predicted by the PRAM analysis of the programs. However, the number of allowable threads per processor is not infinite. In this paper, we introduce the Threaded Many-core Memory (TMM) model which is meant to capture the important characteristics of these highly-threaded, many-core machines. Since we model some important machine parameters of these machines, we expect analysis under this model to give more fine-grained performance prediction than the PRAM analysis. We analyze 4 algorithms for the classic all pairs shortest paths problem under this model. We find that even when two algorithms have the same PRAM performance, our model predicts different performance for some settings of machine parameters. For example, for dense graphs, the Floyd-Warshall algorithm and Johnson’s algorithms have the same performance in the PRAM model. However, our model predicts different performance for large enough memory-access latency and validates the intuition that the Floyd-Warshall algorithm performs better on these machines.
Work submitted and accepted by ICPADS'2012
Studying Network Optimization in the Context of Self-StabilizationUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah
9/21/2012 Reports/Attachments/998/Saifullah.pdf
Self-stabilization is a theoretical framework of non-masking fault-tolerance for distributed networks. A self-stabilizing system is capable of tolerating any unexpected transient fault without outside intervention and, regardless of the initial state, it can converge to a legitimate global state, a predefined vector of local states, in finite time. Self-stabilization has rendered a good problem solving paradigm of networks over the last decade. In this paper, we survey the self-stabilizing solutions for various network optimization problems such as network flow, load balancing, load and resource distribution, routing, file distribution, shortest paths etc. The paper  also summarizes some recent works presenting how the convergence of a self-stabilizing distributed network can be modelled as a convex optimization problem with the exploitation of an analogy between self-stabilizing systems and stable feedback systems. The works pertaining to gradient adaptation of self-stabilizing system are also presented.
Self-Stabilization in the Distributed Systems of Finite State MachinesUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah
9/21/2012 Reports/Attachments/997/Self_stabilization.pdf
The notion of self-stabilization was first proposed by Dijkstra in 1974 in his classic paper. The paper defines a system as self-stabilizing if, starting at any, possibly illegitimate, state the system can automatically adjust itself to eventually converge to a legitimate state in finite amount of time and once in a legitimate state it will remain so unless it incurs a subsequent transient fault. Dijkstra limited his attention to a ring of finite-state machines and provided its solution for self-stabilization. In the years following his introduction, very few papers were published in this area. Once his proposal was recognized as a milestone in work on fault tolerance, the notion propagated among the researchers rapidly and many researchers in the distributed systems diverted their attention to it. The investigation and use of self-stabilization as an approach to fault-tolerant behavior under a model of transient failures for distributed systems is now undergoing a renaissance. A good number of works pertaining to self-stabilization in the distributed systems were proposed in the yesteryears most of which are very recent. This report surveys all previous works available in the literature of self-stabilizing systems.
Computational Study of Small Noncoding RNAs and Their FunctionsUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Xuefeng Zhou
9/10/2012 Reports/Attachments/996/zhou-dissertation.pdf
Post-transcriptional gene regulation at the RNA level has been recently shown to be more widespread and important than previously assumed. While various regulatory RNA molecules have been reported in animals and plants, two prominent types of regulatory small RNAs are microRNAs (miRNAs) and endogenous short interfering RNAs (siRNAs). Because of their importance, their nature and the difficulties in studying them, research in miRNAs has been an active research topic with many computational challenges.

First, computational strategies for miRNA identification have been developed to overcome the technical hurdles for experimental methods based on expression screening. We propose and develop a novel ranking algorithm based on random walks to computationally predict novel miRNAs from genomes, which have a few known miRNAs, may be poorly annotation and even not completely assembled. We also develop meta-feature based classification method to identify miRNAs from high-throughput sequencing data of small RNAs. In addition, we devise a pipeline to analyze natsiRNA in high-throughput sequencing data.

Secondly, we formulate the problem of promoter prediction based on multiple instance learning scheme, and propose an effective promoter identification algorithm, called CoVote. We apply CoVote to predict microRNA core promoters. We investigate core promoter regions of microRNA genes in Caenorhabditis elegans, Homo sapiens, Arabidopsis thaliana and Oryza sativa, and further analyze sequence motifs in the putative core promoters which may be involved in the transcription of microRNA genes.

Furthermore, with characterized promoters of miRNA genes, we apply data mining approaches to model the transcriptome of miRNAs under particular conditions. Additionally, by integrating miRNA target genes, we further analyze the miRNA-mediate regulatory networks and computationally identify network motifs. Since miRNAs and their targets can be formulated as a natural bipartite network(graph), we propose and develop a tool to study modules in the miRNA-regulatory network.
Reading Your Own Mind: Dynamic Visualization Of Real-time Neural SignalsUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Zachary Freudenburg
9/7/2012 Reports/Attachments/995/Freudenburg.pdf
Brain Computer Interfaces (BCI) systems which allow humans to control external devices directly from brain activity, are becoming increasingly popular due to dramatic advances in the ability to both capture and interpret brain signals. Further advancing BCI systems is a compelling goal both because of the neurophysiology insights gained from deriving a control signal from brain activity and because of the potential for direct brain control of external devices in applications such as brain injury recovery, human prosthetics, and robotics. The dynamic and adaptive nature of the brain makes it difficult to create classifiers or control systems that will remain effective over time. However it is precisely these qualities that offer the potential to use feedback to build on simple features and create complex control features that are robust over time. This dissertation presents work that addresses these opportunities for the specific case of Electrocorticography (ECoG) recordings from clinical epilepsy patients. First, queued patient tasks were used to explore the predictive nature of both local and global features of the ECoG signal. Second, an algorithm was developed and tested for estimating the most informative features from naive observations of ECoG signal. Third, a software system was built and tested that facilitates real-time visualizations of ECoG signal patients and allows ECoG epilepsy patients to engage in an interactive BCI control feature screening process.
Modeling Surfaces from Volume Data Using Nonparallel ContoursUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Ross Sowell
8/31/2012 Reports/Attachments/994/Sowell.pdf
Magnetic resonance imaging (MRI) and computed tomography (CT) scanners have long been used to produce three-dimensional (3D) samplings of anatomy elements for use in medical visualization and analysis. From such data sets, physicians often need to construct surfaces representing anatomical shapes in order to conduct treatment, such as irradiating a tumor. Traditionally, this is done through a time-consuming and error-prone process in which an experienced scientist or physician marks a series of parallel contours that outline the structures of interest. Recent advances in surface reconstruction algorithms have led to methods for reconstructing surfaces from nonparallel contours that could greatly reduce the manual component of this process. Despite these technological advances, the segmentation process has remained unchanged.

This dissertation takes the first steps toward bridging the gap between the new surface reconstruction technologies and bringing those methods to use in clinical practice. We develop VolumeViewer [68], a novel interface for modeling surfaces from volume data by allowing the user to sketch contours on arbitrarily oriented cross-sections of the volume. We design the algorithms necessary to support nonparallel contouring, and we evaluate the system with medical professionals using actual patient data. In this way, we begin to understand how nonparallel contouring can aid the segmentation process and expose the challenges associated with a nonparallel contouring system in practice.
Delaunay-restricted Optimal Triangulation of 3D PolygonsUse SHIFT+ENTER to open the menu (new window).
Ming Zou, Tao Ju, and Nathan Carr
7/2/2012 Reports/Attachments/993/pringleSGP.pdf
Triangulation of 3D polygons is a well studied topic of research.  Existing methods for finding triangulations that minimize given metrics (e.g., sum of triangle areas or dihedral angles) run in a costly $O(n^4)$ time \cite{Barequet95,Barequet96}, while the triangulations are not guaranteed to be free of intersections. To address these limitations, we restrict our search to the space of triangles in the Delaunay tetrahedralization of the polygon. The restriction allows us to reduce the running time down to $O(n^2)$ in practice ($O(n^3)$ worst case) while guaranteeing that the solutions are intersection free. We demonstrate experimentally that the reduced search space is not overly restricted. In particular, triangulations restricted to this space usually exist for  practical inputs, and the optimal triangulation in this space approximates well the optimal triangulation of the polygon. This makes our algorithms a practical solution when working with real world data.
Work submitted to but not accepted by SGP'12.
MCFlow: Middleware for Mixed-Criticality Distributed Real-Time SystemsUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Huang-Ming Huang
6/26/2012 Reports/Attachments/992/thesis-main_Huang.pdf
Traditional fixed-priority scheduling analysis for periodic/sporadic task sets is based on the assumption that all tasks are equally critical to the correct operation of the system. Therefore, every task has to be schedulable under the scheduling policy, and estimates of tasks’ worst case execution times must be conservative in case a task runs longer than is usual. To address the significant under-utilization of a system’s resources under normal operating conditions that can arise from these assumptions, several mixed-criticality scheduling approaches have been proposed. However, to date there has been no quantitative comparison of system schedulability or run-time overhead for the different approaches.

In this dissertation, we present what is to our knowledge the first side-by-side implementation and evaluation of those approaches, for periodic and sporadic mixed-criticality tasks on uniprocessor or distributed systems, under a mixed-criticality scheduling model that is common to all these approaches. To make a fair evaluation of mixed-criticality scheduling, we also address some previously open issues and propose modifications to improve schedulability and correctness of particular approaches.

To facilitate the development and evaluation of mixed-criticality applications, we have designed and developed a distributed real-time middleware, called MCFlow, for mixed-criticality end-to-end tasks running on multi-core platforms. The research presented in this dissertation provides the following contributions to the state of the art in real-time middleware: (1) an efficient component model through which dependent subtask graphs can be configured flexibly for execution within a single core, across cores of a common host, or spanning multiple hosts; (2) support for optimizations to inter-component communication to reduce data copying without sacrificing the ability to execute subtasks in parallel; (3) a strict separation of timing and functional concerns so that they can be configured independently; (4) an event dispatching architecture that uses lock free algorithms where possible to reduce memory contention, CPU context switching, and priority inversion; and (5) empirical evaluations of MCFlow itself and of different mixed criticality scheduling approaches both with a single host and end-to-end across multiple hosts. The results of our evaluation show that in terms of basic distributed real-time behavior MCFlow performs comparably to the state of the art TAO real-time object request broker when only one core is used and outperforms TAO when multiple cores are involved. We also identify and categorize different use cases under which different mixed criticality scheduling approaches are preferable.
Specializing Interfaces for Citizen Science Segmentation of Volumetric DataUse SHIFT+ENTER to open the menu (new window).
Michelle Vaughan, Cindy Grimm, Ruth West, Ross Sowell, Robert Pless and Stephen Kobourov
6/14/2012 Reports/Attachments/991/MG_CSSeg-sub.pdf
Segmentation of 3D and time-varying volumetric (4D) image data is considered a time and resource intensive bottleneck in scientific endeavors. Automatic methods are becoming more reliable, but many data sets still require manual intervention. This can mainly be attributed to the characteristics of the image data not being amenable to automated methods, the existence of variations in or poor image quality, or the need for an expert to review and edit results from an automatic technique. Manually segmenting volumetric data is a challenge even for those more experienced. Understanding the 3D nature of the data and navigating through the 3D environment poses some of the main difficulties of the task. Understanding what it means to segment data, and what a contour and a set of contours portrays is key to producing meaningful and usable results. Our goal is to construct a system that will allow even novices to segment 3D data and produce results that experts and scientists can review and use. Through the use of fe tures such as tutorials, contouring protocols, navigation aids, and other interface enhancements, we believe that we can create a workflow that will allow any level of user to work with and understand 3D data. We propose a guided segmentation system that (1) lets experts create data specific segmentation aids, (2) helps users create a meaningful segmentation using these aids, and (3) allows the experts to use the results to provide helpful information to scientists.
The Clear Channel PriorUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Devorah Langsam
5/8/2012 Reports/Attachments/990/DevorahLangsam.pdf
Capturing imagery from outdoor cameras provides a large amount of information about a scene. The true surface appearances of elements in a scene, however, are often incorrectly represented in images. To get a better representation of the scene it is necessary to separate the effects of the underlying reflectance, illumination, and fog in the image. The goal of the dark channel prior is to eliminate the effects of haze in outdoor images and recover the true surface reflectance image for the scene.
Common DMA Engine InterfaceUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Roger Alessi
5/8/2012 Reports/Attachments/989/Roger Alessi.pdf
Circuit boards with Field Programmable Gate Arrays (FPGAs) have a historically diverse set of standards for communicating with other devices. This provides a challenge for developers creating FPGA applications and makes migration of applications from one board or FPGA to another difficult.

Many board manufacturers, including Xilinx and GiDEL, create boards with Peripheral Component Interconnect Express (PCIe) buses. PCIe provides a low-level standard for transferring data between a Central Processing Unit (CPU) and an FPGA, and these manufacturers have designed Direct Memory Access (DMA) engines to implement the standard. However, each manufacturer’s DMA engines do not share a standard interface, leading to the same difficulties switching between boards or FPGAs.

The goal of this project is to reach a common interface for the diverse set of available DMA engines to improve application development productivity, particularly for the Autopipe environment, an environment for developing multi-device streaming applications.
Building a Skeleton of a Human Hand Using Microsoft KinectUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Jed Jackoway
5/1/2012 Reports/Attachments/988/Jackoway_Project.pdf
The goal of the project was to reconstruct the skeleton of a Microsoft Kinect user’s hand. Out of the box, Kinect reconstruct the skeleton of users’ bodies, but it only does large joints, such that the hand is given a location on the general skeleton, but the specifics of the fingers and fist are not actually calculated.
Kinect Hand Recognition and TrackingUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Alex Drake
5/1/2012 Reports/Attachments/987/Kinect Hand Recognition and Tracking.pdf
The goal of this research project was to be able to identify and track a hand using the depth image from a Microsoft Kinect.  The ability to do this would have uses in sign language recognition, rehabilitation, and gesture recognition amongst others.  The Microsoft Kinect is currently capable of identifying and tracking whole bodies.  Our approach was to follow the method that Microsoft used but apply it to detailed hand recognition instead of the body.  The basic strategy they used was to take a large amount of labeled depth data and build a decision tree to identify which part of the body each pixel of an image belongs to.
Practical Approaches to Biological Network DiscoveryUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Brian Haynes
5/1/2012 Reports/Attachments/986/BrianHaynes.pdf
This dissertation addresses a current outstanding problem in the field of systems biology, which is to identify the structure of a transcriptional network from high-throughput experimental data. Understanding of the connectivity of a transcriptional network is an important piece of the puzzle, which relates the genotype of an organism to its phenotypes. An overwhelming number of computational approaches have been proposed to perform integrative analyses on large collections of high-throughput gene expression datasets to infer the structure of transcriptional networks. I put forth a methodology by which these tools can be evaluated and compared against one another to better understand their strengths and weaknesses. Next I undertake the task of utilizing high-throughput datasets to learn new and interesting network biology in the pathogenic fungus Cryptococcus neoformans. Finally I propose a novel computational method for mapping out transcriptional networks that unifies two orthogonal strategies for network inference. I apply this method to map out the transcriptional network of Saccharomyces cerevisiae and demonstrate how network inference results can complement chromatin immunoprecipitation (ChIP) experiments, which directly probe the binding events of transcriptional regulators. Collectively, my contributions improve both the accessibility and practicality of network inference methods.
YouponUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Garrison Prinslow
4/30/2012 Reports/Attachments/985/GarrisonPrinslow_MEngCSE_MastersProject.pdf
This project was motivated by two related ideas: what can be learned about the unique issues involved with developing cloud-based mobile applications; and, what application could be developed to evaluate these characteristics that would also be innovative and provide value to users. After vetting ideas for the latter objective, it was clear that a new digital coupon system could offer value and apply interesting technologies to computer science problems, such as shortest path calculation, position-aware authentication, game theory, and statistical reasoning. A brief summary of the motivation for developing the new digital coupon system follows.
Early Warning System: Relay Sensor Deployment & Network Reliability AnalysisUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Zhicheng Yang
4/27/2012 Reports/Attachments/984/Yang_MastersProject.pdf
In this project, we continued Dr. Chipara's study, which developed an early warning system (EWS) to detect the vital signs of patients in order to help doctors to intervene in time [1]. Since the number of wards increased, the environment our system faced with became more complicated and our network became more sensitive. This project focused on finding reasons on the relays that didn't work and doing a reliability analysis on the network in one ward our study covered.
Correction of an Augmentation Bound Analysis for Parallel Real-Time TasksUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah,   Kunal Agrawal, Chenyang Lu, Christopher Gill
4/9/2012 Reports/Attachments/983/Notes-new-rtsj.pdf
This paper proposes some significant corrections in a recent work of Lakshmanan et al on parallel task scheduling. Lakshmanan et al have proposed a transformation of parallel tasks into sequential tasks, and have claimed a resource augmentation bound of 3:42 for partitioned deadline monotonic (DM) scheduling of the transformed tasks. We demonstrate that their analysis for resource augmentation bound is incorrect. We propose a different technique for task transformation that requires a resource augmentation bound of 5 for partitioned DM scheduling.
High Speed Networking in the Multi-Core EraUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Benjamin Wun
3/27/2012 Reports/Attachments/981/Wun_thesis.pdf
High speed networking is a demanding task that has traditionally been performed in dedicated, purpose built hardware or specialized network processors. These platforms sacrifice flexibility or programmability in favor of performance. Recently, there has been much interest in using multi-core general purpose processors, which have the advantages of being easily programmable and upgradeable. The best way to exploit these new architectures for networking is an open question that has been the subject of much recent research. In this disseration, I explore the best way to exploit multicore general purpose processors for packet processing applications. This includes both new architectural organizations for the processors as well as changes to the systems software. I intend to demonstrate the efficacy of these techniques by using them to build an open and extensible network security and monitoring platform that can out perform existing solutions.
Optimal Control for Autonomous Motor BehaviorUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Tom Erez
3/27/2012 Reports/Attachments/980/Erez_thesis.pdf
This dissertation presents algorithms that allow robots to generate optimal behavior from first principles. Instead of hard-coding every desired behavior, we encode the task as a cost function, and use numerical optimization to find action sequences that can accomplish the task. Using the theoretical framework of optimal control, we develop methods for generating autonomous motor behavior in high-dimensional domains of legged locomotion.
We identify three foundational problems that limit the application of existing optimal control algorithms, and present guiding principles that address these issues. First, some traditional algorithms use global optimization, where every possible state is considered. This approach cannot be applied in continuous domains, where every additional mechanical degree of freedom exponentially increases the volume of state space. In order to sidestep this curse of dimensionality, we focus on trajectory optimization, which finds locally-optimal solutions while scaling only polynoimally with state dimensionality. Second, many algorithms of optimal control and reinforcement learning cannot be directly applied to continuous domains with contacts, due to the non-smooth dynamics. We present techniques of contact smoothing that enable the use of standard continuous optimization tools. Finally, domains of legged locomotion give rise to extremely non-convex optimization, which is riddled with local minima. This issue is addressed by using a shaping protocol (homotopy-continuation), whereby the solution of an easier variant of the problem is used as initial guess for a harder variant of the problem, gradually finding a solution to the original domain.
In dynamic environments, effective behavior requires feedback control, which provides appropriate reaction to changing circumstance. We present tools that generate adaptive autonomous behavior through local optimization, and scale well to domains with over 60 state-space dimensions. We present a series of experiments that demonstrate robust and reactive motor behavior in domains of increasing complexity. First, a multi-joint swimmer which can follow a moving target, withstand violent perturbations, and avoid moving obstacles. Second, a one-legged hopper that can maintain its gait through both perturbations and morphological alteration. And finally, two bipeds: a planar walker, and a 3D runner that uses its arms to balance its gait.
Utility-Aware Scheduling of Stochastic Real-Time SystemsUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Terry Tidwell
3/27/2012 Reports/Attachments/979/tidwellThesis.pdf
Time utility functions offer a reasonably general way to describe the complex timing constraints of real-time and cyber-physical systems. However, utility-aware scheduling policy design is an open research problem. In particular, scheduling policies that optimize expected utility accrual are needed for real-time and cyber-physical domains.
This dissertation addresses the problem of utility-aware scheduling for systems with periodic real-time task sets and stochastic non-preemptive execution intervals. We model these systems asMarkov Decision Processes. This model provides an evaluation framework by which different scheduling policies can be compared. By solving the Markov Decision Process we can derive value-optimal scheduling policies for moderate sized problems.
However, the time and memory complexity of computing and storing value-optimal scheduling policies also necessitates the exploration of other more scalable solutions. We consider heuristic schedulers, including a generalization we have developed for the existing Utility Accrual Packet Scheduling Algorithm. We compare several heuristics under soft and hard real-time conditions, different load conditions, and different classes of time utility functions. Based on these evaluations we present guidelines for which heuristics are best suited to particular scheduling criteria.
Finally, we address the memory complexity of value-optimal scheduling, and examine trade-offs between optimality and memory complexity. We show that it is possible to derive good low complexity scheduling decision functions based on a synthesis of heuristics and reduced-memory approximations of the value-optimal scheduling policy.
Accounting for Failures in Delay Analysis for WirelessHART NetworksUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah, Paras Babu Tiwari, Bo Li, Chenyang Lu, Yixin Chen
2/22/2012 Reports/Attachments/978/reliable_scheduling_1.pdf
WirelessHART networks are gaining ground as a real-time communication infrastructure in industrial wireless control systems. Because wireless communication is often susceptible to transmission failures in industrial environments, it is essential to account for failures in the delay analysis for realtime flows between sensors and actuators in process control. WirelessHART networks handle transmission failures through retransmissions using dedicated and shared time slots through different paths in the routing graphs. While these mechanisms for handling transmission failures are critical for process control requiring reliable communication, they introduce substantial challenges to worst-case end-to-end delay analysis for real-time flows. This paper presents the first worst-case end-to-end delay analysis for periodic real-time flows in a WirelessHART network that takes into account transmission failures. The delay bounds can be used to quickly assess the schedulability of real-time flows for industrial wireless control applications with stringent requirements on both high reliability and network latency. Simulations based on the topologies of a wireless sensor network testbed consisting of 69 TelosB motes indicate that our analysis provides safe upper bounds of the end-to-end delays of real-time flows at an acceptable level of pessimism.
Real-Time Scheduling of Parallel Tasks under a General DAG ModelUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah, David Ferry, Kunal Agrawal, Chenyang Lu, and Christopher Gill
2/22/2012 Reports/Attachments/977/dag_parallel_1.pdf
Due to their potential to deliver increased performance over single-core processors, multi-core processors have become mainstream in processor design. Computation-intensive real-time systems must exploit intra-task parallelism to take full advantage of multi-core processing. However, existing results in real-time scheduling of parallel tasks focus on restrictive task models such as the synchronous model where a task is a sequence of alternating parallel and sequential segments, and parallel segments have threads of execution that are of equal length. In this paper, we address a general model for deterministic parallel tasks, where a task is represented as a DAG with different nodes having different execution requirements. We make several key contributions towards both preemptive and non-preemptive realtime scheduling of DAG tasks on multi-core processors. First, we propose a task decomposition that splits a DAG into sequential tasks. Second, we prove that parallel tasks, upon decomposition, can be scheduled using preemptive global EDF with a resource augmentation bound of 4. This bound is as good as the best known bound for more restrictive models, and is the first for a general DAG model. Third, we prove that the decomposition has a resource augmentation bound of 4 plus a non-preemption overhead for non-preemptive global EDF scheduling. To our knowledge, this is the first resource augmentation bound for nonpreemptive scheduling of parallel tasks. Through simulations, we demonstrate that the achieved bounds are safe and sufficient.
An Integrated Data Mining Approach to Real-time Clinical Monitoring and Deterioration WarningUse SHIFT+ENTER to open the menu (new window).
Yi Mao, Wenlin Chen, Yixin Chen, Chenyang Lu, Marin Kollef, Thomas Bailey
2/21/2012 Reports/Attachments/976/sigproc-sp.pdf
Clinical study found that early detection and intervention are essential for preventing clinical deterioration in patients, for patients both in intensive care units (ICU) as well as in general wards but under real-time data sensing (RDS). In this paper, we develop an integrated data mining approach to give early deterioration warnings for patients under real-time monitoring in ICU and RDS.

Existing work on mining real-time clinical data often focus on certain single vital sign and specific disease. In this paper, we consider an integrated data mining approach for general sudden deterioration warning. We synthesize a large feature set that includes first and second order time-series features, detrended fluctuation analysis (DFA), spectral analysis, approximative entropy, and cross-signal features. We then systematically apply and evaluate a series of established data mining methods, including forward feature selection, linear and nonlinear classification algorithms, and exploratory undersampling for class imbalance.

An extensive empirical study is conducted on real patient data collected between 2001 and 2008 from a variety of ICUs. Results show the benefit of each of the proposed techniques, and the final integrated approach significantly improves the prediction quality. The proposed clinical warning system is currently under integration with the electronic medical record system at Barnes-Jewish Hospital in preparation for a clinical trial. This work represents a promising step toward general early clinical warning which has the potential to significantly improve the quality of patient care in hospitals.
Just Draw It! A 3D Sketching SystemUse SHIFT+ENTER to open the menu (new window).
Cindy Grimm and Pushkar Joshi
1/9/2012 Reports/Attachments/975/JustDrawIt_SBIM2012_sub.pdf
We present a system for sketching in 2D to create 3D curves. The interface is light-weight, pen-based, and based on observations of how artists sketch on paper.
RIDE: A Mixed-Mode Control Interface for Mobile Robot TeamsUse SHIFT+ENTER to open the menu (new window).
Erik Karulf, Marshall Strother, Parker Dunton, and William D. Smart
1/7/2012 Reports/Attachments/974/ride-tr.pdf
There is a growing need for robot control interfaces that allow a single user to effectively control a large number of mostly-autonomous robots.  The challenges in controlling such a collection of robots are very similar to the challenges of controlling characters in some genres of video games.  In this paper, we argue that interfaces based on elements from computer video games are effective tools for the control of large robot teams.  We present RIDE, the Robot Interactive Display Environment, an example of such an interface, and give the results of initial user studies with the interface, which lend support to our claim.
The Forest Overlay NetworkUse SHIFT+ENTER to open the menu (new window).
Logan Stafman
12/23/2011 Reports/Attachments/970/Forest.pdf
Forest is an overlay network designed for large real-time distributed systems.  In particular, we’re interested in virtual worlds that provide high-quality interaction over a constantly changing pattern of communication.  Forest is suitable for applications in which many entities send data to a large set of constantly changing entities.  By using tree-structured channels, we can create logically isolated private networks which support both unicast and multicast routing.  In this paper, we will discuss the core components of Forest and measure the performance of the control elements of the network. We will also provide examples of control sequences and the roles played by control elements in those sequences to help maintain fast, reliable data delivery.
A Survey on Communication Networks in Emergency Warning SystemsUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Yan Li
12/21/2011 Reports/Attachments/969/YanLi_project.pdf
Real Time Baseball Augmented RealityUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Adam Kraft
12/6/2011 Reports/Attachments/968/Kraft_Baseball_AR_1.pdf
As cellular phones grow faster and become equipped with better sensors, consumers are seeing a rise in augmented reality applications available on their mobile devices. Sports augmented reality is one area that is projected to grow over the next couple of years. We aim to build an application intended for use at a live baseball game. Our application optimizes a best-fit homography to locate the structure of the playing field. From there, we are able to visualize information and statistics directly onto the user's mobile device. This is accomplished in real time, requiring minimal input from the user.
2-Edge-Connectivity and 2-Vertex-Connectivity with Fault ContainmentUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah
11/11/2011 Reports/Attachments/967/conn_paper.pdf
Self-stabilization for non-masking fault-tolerant distributed system has received considerable research interest over the last decade. In this paper, we propose a self-stabilizing algorithm for 2-edge-connectivity and 2-vertex-connectivity of an asynchronous distributed computer network. It is based on a self-stabilizing depth-first search, and is not a composite algorithm in the sense  that it is not composed of a number of self-stabilizing algorithms that run concurrently. The time and space complexities of the algorithm are the same as those of the underlying self-stabilizing depth-first search algorithm.
Modeling and Dynamic Resource Allocation for High Definition and Mobile Video StreamsUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Abdel-Karim Al-Tamimi
8/1/2010 Reports/Attachments/966/abdel.pdf
Video streaming traffic has been surging in the last few years, which has resulted in an increase of its Internet traffic share on a daily basis. The importance of video streaming management has been emphasized with the advent of High Definition (HD) video streaming, as it requires by its nature more network resources.

In this dissertation, we provide a better support for managing HD video traffic over both wireless and wired networks through several contributions. We present a simple, general and accurate video source model: Simplified Seasonal ARIMA Model (SAM). SAM is capable of capturing the statistical characteristics of video traces with less than 5% difference from their calculated optimal models. SAM is shown to be capable of modeling video traces encoded with MPEG-4 Part2, MPEG-4 Part10, and Scalable Video Codec (SVC) standards, using various encoding settings.

We also provide a large and publicly-available collection of HD video traces along with their analyses results. These analyses include a full statistical analysis of HD videos, in addition to modeling, factor and cluster analyses. These results show that by using SAM, we can achieve up to 50% improvement in video traffic prediction accuracy. In addition, we developed several video tools, including an HD video traffic generator based on our model. Finally, to improve HD video streaming resource management, we present a SAM-based delay-guaranteed dynamic resource allocation (DRA) scheme that can provide up to 32.4% improvement in bandwidth utilization.
Near Optimal Rate Selection for Wireless Control SystemsUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah, Chengjie Wu, Paras Babu Tiwari, You Xu, Yong Fu, Chenyang Lu, and Yixin Chen
10/17/2011 Reports/Attachments/965/Control_Optimization1.pdf
With the advent of industrial  standards such as WirelessHART, process industries are now gravitating towards  wireless control systems.  Due to limited  bandwidth in a wireless network shared by multiple control loops,  it is critical  to optimize the overall control performance.  In this paper, we address the scheduling-control co-design problem   of determining the optimal sampling rates of   feedback control loops sharing a WirelessHART network.  The objective is to minimize the overall control cost  while  ensuring that all data flows  meet their end-to-end deadlines.  The resulting constrained optimization based on existing delay bounds for WirelessHART networks   is  challenging  since it is non-differentiable, non-linear, and not in closed-form.  We  propose four methods to solve this problem. First, we present  a subgradient method for rate selection. Second, we propose a greedy heuristic  that usually achieves low control cost while significantly reducing the execution time.   Third, we propose a global constrained  optimization algorithm using a simulated annealing (SA) based penalty method.   Finally, we formulate rate selection  as a differentiable convex optimization problem that provides  a closed-form solution through a gradient descent method.  This is based on a new delay bound that is convex and differentiable, and hence simplifies the optimization problem.  We evaluate all methods through simulations based on  topologies of a $74$-node wireless sensor network testbed. Surprisingly, the subgradient method is disposed to incur the longest execution time as well as the highest control cost among all methods. SA and the greedy heuristic represent the opposite ends of the tradeoff between control cost and execution  time, while the gradient descent method hits the balance between the two.
Implementation and Evaluation of Mixed-Criticality Scheduling Approaches for Periodic TasksUse SHIFT+ENTER to open the menu (new window).
Huang-Ming Huang, Christopher Gill, Chenyang Lu
10/14/2011 Reports/Attachments/964/mc_tech_report2.pdf
Traditional fixed-priority scheduling analysis for periodic task sets is based on the assumption that all tasks are equally critical to the correct operation of the system. Therefore, every task has to be schedulable under the scheduling policy, and estimates of tasks’ worst case execution times must be conservative in case a task runs longer than is usual. To address the significant under-utilization of a system’s resources under normal operating conditions that can arise from these assumptions, three main approaches have been proposed: priority assignment, period transformation, and zero-slack scheduling. However, to date there has been no quantitative comparison of system schedulability or run-time overhead for the different approaches. In this paper, we present what is to our knowledge the first side-by-side evaluation of those approaches, for periodic mixed-criticality tasks on uniprocessor systems, under a mixed-criticality scheduling model that is common to all three approaches. To make a fair evaluation of zero-slack scheduling, we also address two previously open issues: how to accommodate execution of a task after its deadline, and how to account for previously unidentified forms of interference between mixed-criticality tasks. Our simulations show that while priority assignment and period transformation are most likely to be able to schedule a randomly selected task set, a small fraction of the task sets are schedulable only under the zero-slack approach. Our empirical evaluation demonstrates that user-space implementations of mechanisms to enforce period transformation and zero-slack scheduling can be achieved on Linux without kernel modification, with suitably low overhead for mixed-criticality real-time task sets.
Low-Impact Profiling of Streaming, Heterogeneous ApplicationsUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Joseph Lancaster
10/3/2011 Reports/Attachments/963/thesis-Lancaster.pdf
Computer engineers are continually faced with the task of translating improvements in fabrication process technology (i.e., Moore’s Law) into architectures that allow computer scientists to accelerate application performance. As feature-size continues to shrink, architects of commodity processors are designing increasingly more cores on a chip. While additional cores can operate independently with some tasks (e.g. the OS and user tasks), many applications see little to no improvement from adding more processor cores alone.

For many applications, heterogeneous systems offer a path toward higher performance. Significant performance and power gains have been realized by combining specialized processors (e.g., Field-Programmable Gate Arrays, Graphics Processing Units) with general purpose multi-core processors. Heterogeneous applications need to be programmed differently than traditional software. One approach, stream processing, fits these systems particularly well because of the segmented memories and explicit expression of parallelism. Unfortunately, debugging and performance tools that support streaming, heterogeneous applications do not exist.

This dissertation presents TimeTrial, a performance measurement system that enables performance optimization of streaming applications by profiling the application deployed on a heterogeneous system. TimeTrial performs low-impact measurements by dedicating computing resources to monitoring and by aggressively compressing performance traces into statistical summaries guided by user specification of the performance queries of interest.
End-to-End Communication Delay Analysis in WirelessHART NetworksUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen
9/29/2011 Reports/Attachments/962/NEW_SCHEDULABILITY.pdf
WirelessHART is a new standard specifically designed for real-time and reliable communication between sensor and actuator devices for industrial process monitoring and control applications. End-to-end communication delay analysis for WirelessHART networks is required to determine the schedulability of real-time data flows from sensors to actuators for the purpose of acceptance test or workload adjustment in response to network dynamics. In this paper, we map the scheduling of real-time periodic data flows in a WirelessHART network to real-time multiprocessor scheduling. We then exploit the response time analysis for multiprocessor scheduling and propose a novel method for the delay analysis that establishes an upper bound of the end-to-end communication delay of each real-time flow in a WirelessHART network. Simulation studies based on both random topologies and real network topologies of a 74-node physical wireless sensor network testbed demonstrate that our analysis provides safe and reasonably tight upper bounds of the end-to-end delays of real-time flows, and hence enables effective schedulability tests for WirelessHART networks.
Practical and Robust Power Management for Wireless Sensor NetworksUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Gregory Hackmann
9/20/2011 Reports/Attachments/961/Hackmann.pdf
Wireless Sensor Networks (WSNs) consist of tens or hundreds of small, inexpensive computers equipped with sensors and wireless communication capabilities. Because WSNs can be deployed without xed infrastructure, they promise to enable sensing applications in environments where installing such infrastructure is not feasible. However, the lack of xed infrastructure also presents a key challenge for application developers: sensor nodes must often operate for months or years at a time from fixed or limited energy sources. The focus of this dissertation is on reusable power management techniques designed to facilitate sensor network developers in achieving their systems' required lifetimes.

Broadly speaking, power management techniques fall into two categories. Many power management protocols developed within the WSN community target specific hardware subsystems in isolation, such as sensor or radio hardware. The first part of this dissertation describes the Adaptive and Robust Topology control protocol (ART), a representative hardware-speci c technique for conserving energy used by packet transmissions.

In addition to these single-subsystem approaches, many applications can benefit greatly from holistic power management techniques that jointly consider the sensing, computation, and communication costs of potential application configurations. The second part of this dissertation extends this holistic power management approach to two families of structural health monitoring applications. By applying a partially-decentralized architecture, the cost of collecting vibration data for analysis at a centralized base station is greatly reduced.

Finally, the last part of this dissertation discusses work toward a system for clinical early warning and intervention. The feasibility of this approach is demonstrated through preliminary study of an early warning component based on historical clinical data. An ongoing clinical trial of a real-time monitoring component also provides important guidelines for future clinical deployments based on WSNs.
Optimization of Gene Prediction via More Accurate Phylogenetic Substitution ModelsUse SHIFT+ENTER to open the menu (new window).
Ezekiel Maier, Randall H Brown, and Michael R Brent

9/9/2011 Reports/Attachments/960/Tech_Report_6-6_ejm.pdf
Determining the beginning and end positions of each exon in each protein coding gene within a genome can be difficult because the DNA patterns that signal a gene’s presence have multiple weakly related alternate forms and the DNA fragments that comprise a gene are generally small in comparison to the size of the genome. In response to this challenge, automated gene predictors were created to generate putative gene structures.  N SCAN identifies gene structures in a target DNA sequence and can use conservation patterns learned from alignments between a target and one or more informant DNA sequences. N SCAN uses a Bayesian network, generated from a phylogenetic tree, to probabilistically relate the target sequence to the aligned sequence(s).  Phylogenetic substitution models are used to estimate substitution likelihood along the branches of the tree.  
     Although N SCAN’s predictive accuracy is already a benchmark for de novo HMM based gene predictors, optimizing its use of substitution models will allow for improved conservation pattern estimates leading to even better accuracy. Selecting optimal substitution models requires avoiding overfitting as more detailed models require more free parameters; unfortunately, the number of parameters is limited by the number of known genes available for parameter estimation (training). In order to optimize substitution model selection, we tested eight models on the entire genome including General, Reversible, HKY, Jukes-Cantor, and Kimura. In addition to testing models on the entire genome, genome feature based model selection strategies were investigated by assessing the ability of each model to accurately reflex the unique conservation patterns present in each genome region. Context dependency was examined using zeroth, first, and second order models. All models were tested on the human and D. melanogaster genomes.  Analysis of the data suggests that the nucleotide equilibrium frequency assumption (denoted as πi) is the strongest predictor of a model’s accuracy, followed by reversibility and transition/transversion inequality.  Furthermore, second order models are shown to give an average of 0.6% improvement over first order models, which give an 18% improvement over zeroth order models. Finally, by limiting parameter usage by the number of training examples available for each feature, genome feature based model selection better estimates substitution likelihood leading to a significant improvement in N SCAN’s gene annotation accuracy.  

Hierarchical Scheduling for Multicores with Multilevel Cache HierarchiesUse SHIFT+ENTER to open the menu (new window).
Kunal Agrawal and Jim Sukha
8/9/2011 Reports/Attachments/959/hws.pdf
Cache-locality is an important consideration for the performance in multicore systems. In modern
and future multicore systems with multilevel cache hierarchies, caches may be arranged in a tree of
caches, where a level k cache is shared between Pk processors, called a processor group, and Pk increases
with k. In order to get good performance, as much as possible, subcomputations that share more data
should execute on processors which share a lower-level cache. Therefore, the number of cache misses in
these systems depends on the scheduling decisions, and a scheduler is responsible for not just achieving
good load-balance and low overheads, but also good cache complexity. However, these can be competing
criteria. In this paper, we explore the tension between these criteria for online hierarchical schedulers.
Formally, we consider a system with P processors, arranged in a multilevel hierarchy according to
a hierarchy tree, where each of the P processors forms a leaf of the tree, and an internal node at level-k
corresponds corresponds to a processor group. In addition, we assume that computations have locality
regions, that represent parallel subcomputations that share data. Each locality region has a particular
level, and the scheduler must ensure that a level-k locality region is executed by processors in the same
level-k processor group, since they share a level k cache. Thus locality regions can improve cache
performance. However, they may also impair load-balance and increase scheduling overheads since the
scheduler must obey the restrictions posed by locality regions.
In this paper, we present a framework of hierarchical computations, that is, computations with locality
regions at multiple levels of nesting. We describe the hierarchical greedy scheduler, where each
locality region is scheduled using a greedy scheduler which attempts to use as many processors as possible
while obeying the restrictions posed by the locality regions. We derive a recurrence for the time
complexity for a region in terms of its nested regions. We also describe how a more realistic hierarchical
work-stealing scheduler can get the same bounds apart from constant factors for an important subclass
of computations called homogenous computations. Finally, we also analyze the cache complexity of
the hierarchical work-stealing scheduler for a system with a multilevel cache hierarchy.
Distributed Channel Allocation Algorithms for Wireless Sensor NetworksUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah,  You Xu,   Chenyang Lu,   and Yixin Chen
7/31/2011 Reports/Attachments/958/Channel_WSN.pdf
Interference between concurrent transmissions can cause severe performance degradation in wireless sensor networks (WSNs). While multiple channels available in WSN technology such as IEEE 802.15.4 can be exploited to mitigate interference, channel allocation can have a significant impact on the performance of multi-channel communication. This paper proposes a set of distributed algorithms for near-optimal channel allocation in WSNs with theoretical bounds.  We first consider the problem of minimizing the number of channels needed to remove interference in a WSN, and propose both receiver-based and link-based distributed channel allocation protocols. For WSNs with an insufficient number of channels, we formulate a fair channel allocation problem whose objective is to minimize the maximum  interference (MinMax) experienced by any transmission link in the network. We prove that MinMax channel allocation is NP-hard and propose a  distributed link-based MinMax channel allocation protocol. We also propose a distributed protocol for link scheduling based on MinMax channel allocation. Simulations based on  real topologies  and data traces collected from a WSN testbed consisting of  74 TelosB motes, and using random topologies  have shown that our channel allocation protocols significantly outperform  a state-of-the-art channel allocation protocol.
Energy-Efficient Low Power Listening for Wireless Sensor Networks in Noisy EnvironmentsUse SHIFT+ENTER to open the menu (new window).
Mo Sha, Gregory Hackmann, Chenyang Lu
4/9/2012 Reports/Attachments/957/aedp.pdf
Low Power Listening (LPL) is a common MAC-layer technique for reducing energy consumption in wireless sensor networks, where nodes periodically wake up to sample the wireless channel to detect activity. However, LPL is highly susceptible to false wakeups caused by environmental noise being detected as activity on the channel, causing nodes to spuriously wake up in order to receive nonexistent transmissions. In empirical studies in residential environments, we observe that the false wakeup problem can significantly increase a node's duty cycle, compromising the benefit of LPL. We also find that the energy-level threshold used by the Clear Channel Assessment (CCA) mechanism to detect channel activity has a significant impact on the false wakeup rate. We then design AEDP, an adaptive energy detection protocol for LPL, which dynamically adjust a node's CCA threshold to improve network reliability and duty cycle based on application-specified bounds. Empirical experiments in both controlled tests and real-world environments showed AEDP can effectively mitigate the impact of noise on radio duty cycles, while maintaining satisfactory link reliability.
Efficient Deadlock Avoidance for Streaming Computation with FilteringUse SHIFT+ENTER to open the menu (new window).
Jeremy Buhler, Kunal Agrawal, Peng Li, Roger D. Chamberlain
7/24/2011 Reports/Attachments/956/balc11.pdf
In this report, we show that deadlock avoidance for
streaming computations with filtering can be performed efficiently for a large class of DAG topologies. We first give efficient algorithms for dummy interval computation in series-parallel DAGs, then generalize our results to a larger graph family, the CS4DAGs, in which every undirected cycle has exactly one source and one sink.  Our results show that, for a large set of application
topologies that are both intuitively useful and formalizable, the
streaming model with filtering can be implemented safely with
reasonable compilation overhead.
Results of an observational study on sketchingUse SHIFT+ENTER to open the menu (new window).
Cindy Grimm
6/10/2011 Reports/Attachments/955/SketchStudyTechReport.pdf
We present the results of an observational study on sketching. Artists were asked to sketch a small number of objects and comment on how and why they made the marks they did. We summarize these findings, from low-level details on individual marks through the drawing construction order. Based on these observations we provide suggestions for future research directions in 3D sketching.
Drawing study description
NS-3 SIMULATION OF WIMAX NETWORKSUse SHIFT+ENTER to open the menu (new window).
MS Thesis
Christopher Thomas
5/25/2011 Reports/Attachments/954/NS-3 Simulation of WiMAX Networks.pdf
Simulation is a powerful tool for analysis and improvement of networking technologies, and many simulation packages are available.  One that is growing in popularity is NS-3, the successor to the popular NS-2.  It is a significant departure from NS-2, and offers many advantages and disadvantages.  In this thesis, we translate and update a sophisticated WiMAX simulation model from NS-2 to NS-3, and use this experience to investigate the major differences between NS-2 and NS-3, and the relative strengths of each package.  We then use the NS-3 simulation model to provide analysis on a new WiMAX OFDMA downlink subframe mapping algorithm.
Advisor: Raj Jain
Mercury BLASTN Biosequence Similarity Search System: Technical Reference GuideUse SHIFT+ENTER to open the menu (new window).
Jeremy Buhler
5/16/2011 Reports/Attachments/953/gblast-reference-guide[1].pdf
This guide documents the operation of the Mercury BLASTN system for hardware-accelerated DNA similarity search. It includes detailed information on the syntax and limitations of the system's component commands, as well as a description of the system's hardware platform suitable for administrators who need to maintain a Mercury BLASTN system.

Mercury BLASTN is a product of the High Performance COmputational Biology Group at Washington University.
Homepage of the High Performance Computational Biology Group
Clinical Interpretation of Novel Copy Number VariationsUse SHIFT+ENTER to open the menu (new window).
MS Thesis
Clifton M. Carey, Jr.
5/3/2011 Reports/Attachments/951/Carey_thesis.pdf
Copy Number Variations (CNVs) are a significant source of human genetic diversity and are believed to be responsible for a wide variety of phenotypic variation. Recent advances in microarray-based genomic hybridization techniques have facilitated CNV analysis as a viable diagnostic technique in the clinic, and several public databases of well-characterized CNVs are being compiled, but a standard for interpreting uncharacterized CNVs has yet to emerge.

This thesis examines the clinical interpretation of uncharacterized CNVs as a multiple instance binary classification problem. We analyze the current state of clinical techniques, then present and test a novel, statistical approach to the problem.
Simplifying the Non-Manifold Topology of Multi-Partitioning Surface NetworksUse SHIFT+ENTER to open the menu (new window).
MS Thesis
Trung Duc Nguyen
5/3/2011 Reports/Attachments/950/Trung-Duc-Nguyen-MS-thesis.pdf
In bio-medical imaging, multi-partitioning surface networks (MPSNs) are very useful to model complex organs with multiple anatomical regions, such as a mouse brain. However, MPSNs are usually constructed from image data and might contain complex geometric and topological features. There has been much research on reducing the geometric complexity of a general surface (non-manifold or not) and the topological complexity of a closed, manifold surface. But there has been no attempt so far to reduce redundant topological features which are unique to non-manifold surfaces, such as curves and points where multiple sheets of surfaces join.

In this thesis, we design interactive and automated means for removing redundant non-manifold topological features in MPSNs, which is a special class of non-manifold surfaces. The core of our approach is a mesh surgery operator that can effectively simplify the non-manifold topology while preserving the validity of the MPSN. The operator is implemented in an interactive user interface, allowing user-guided simplification of the input. We further develop an automatic algorithm that invokes the operator following a greedy heuristic. The algorithm is based on a novel, abstract representation of a non-manifold surface as a graph, which allows efficient discovery and scoring of possible surgery operations without the need for explicitly performing the surgeries on the mesh geometry.
Webcam Image AlignmentUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Matthew Klein
5/2/2011 Reports/Attachments/949/KleinMatthew_project.pdf
Multi-core Real-Time Scheduling for Generalized Parallel Task ModelsUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah  and   Kunal Agrawal
3/21/2011 Reports/Attachments/948/RTSS2011_Saifullah.pdf
Multi-core processors over a significant performance increase over single-core processors. Therefore, they have the potential to enable computation-intensive real-time applications with stringent timing constraints that cannot be met on traditional single-core processors. However, most results in traditional multiprocessor real-time scheduling are limited to sequential programming models and ignore intra-task parallelism. In this paper, we address the problem of scheduling periodic parallel tasks with implicit deadlines on multi-core processors. We first consider a synchronous task model where each task consists of segments, each segment having an arbitrary number of parallel threads that synchronize at the end of the segment. We propose a new task decomposition method that decomposes each parallel task into a set of sequential tasks. We prove that our task decomposition achieves a resource augmentation bound of 4 and 5 when the decomposed tasks are scheduled using global EDF and partitioned deadline monotonic scheduling, respectively. Finally, we extend our analysis to a directed acyclic graph (DAG) task model where each node in the DAG has a unit execution requirement. We show how these tasks can be converted into synchronous tasks such that the same decomposition can be applied and the same augmentation bounds hold. Simulations based on synthetic workload demonstrate that the derived resource augmentation bounds are safe and sufficient.
Generating Muscle Driven Arm Movements Using Reinforcement LearningUse SHIFT+ENTER to open the menu (new window).
MS Project Report
Alex S. Broad
3/7/2011 Reports/Attachments/946/Broad_Alexander_MSProject.pdf
This project focuses on using Reinforcement Learning to optimize arm dynamics through muscle control for desired trajectories. The goal of this project was to create a tool that can be used to gain a better understanding of the arm’s muscles and collect information that is useful in many other disciplines such as biomechanics, anthropology, medicine and robotics. I developed biologically realistic models of primate arm's using Stanford’s SimTK software, an open-source tool for modeling musculoskeletal structures. I then made use of Differential Dynamic Programming in order to generate novel movements from first principles and optimize motion over a specified trajectory. Lastly, I demonstrated the usefulness of this tool by examining its effectiveness in discovering the consequences of different muscle groups on the optimal behavior policy.
Flux Balance Analysis of Dynamic Metabolism in Shewanella oneidensis MR-1 Using a Static Optimization ApproachUse SHIFT+ENTER to open the menu (new window).
Xueyang Feng, You Xu, Yixin Chen, Yinjie Tang
2/21/2011 Reports/Attachments/945/DFBA-1.pdf
Shewanella bacteria are facultative anaerobes isolated from aquatic and sedimentary environments (Hau and Gralnick 2007) with a broad capacity for reduction of multiple electron receptors (Pinchuk et al. 2009; Serres and Riley 2006), including Fe(III), Mn(IV), sulfur, nitrate, and fumarate. With the accomplishment of complete genome sequencing of several Shewanella bacteria, the general pictures of the carbon metabolism have been revealed (Serres and Riley 2006). metabolism.

One of the most physiological methods to decipher the time-variant metabolic regulation is to determine the dynamic distribution of intracellular metabolic fluxes since it reveals the final response of cellular metabolism to genomic, transcriptional and post-transcriptional regulations (Sauer 2006; Tang et al. 2009). In order to track the dynamic intracellular metabolic regulation, dynamic flux balance analysis (DFBA) was developed (Mahadevan et al. 2002), in which cell growth phase was divided into numerous stages, assuming that at each stage a new metabolic steady state was maintained. All the metabolic fluxes were then searched to satisfy the objective functions set for each stage. By solving this nonlinear optimization model using a cutting-edge nonlinear optimization solver (IPOPT), we confirmed the changing of carbon sources for the growth of Shewanella oneidensis MR-1 and deciphered the dynamic regulation of intracellular metabolism.
Feedback Thermal Control of Real-time Systems on Multicore ProcessorsUse SHIFT+ENTER to open the menu (new window).
Yong Fu¤, Nicholas Kottenstette†, Chenyang Lu¤, Xenofon D. Koutsoukos
1/25/2011 Reports/Attachments/942/tbmc_press.pdf
Real-time systems face significant challenges in thermal management with their adoption of modern multicore processors. While earlier research on feedback thermal control has shown promise in dealing with the uncertainties in the thermal characteristics, multicore processors introduce new challenges that cannot be handled by previous solutions designed for single-core processors. Multicore processors require the temperatures and real-time performance of multiple cores to be controlled simultaneously, leading to multi-input-multi-output (MIMO) control problems with inter-core thermal coupling. Furthermore, current Dynamic Voltage and Frequency Scaling (DVFS) mechanisms only support a finite set of states, leading to discrete control variables that cannot be handled by standard linear control techniques. This paper presents Real-Time Multicore Thermal Control (RT-MTC), the first feedback thermal control framework specifically designed for multicore real-time systems. RT-MTC dynamically enforces both the temperature and the CPU utilization bounds of a multicore processor through DVFS with discrete frequencies. RT-MTC employs a highly efficient controller that integrates saturation and proportional control components rigorously designed to enforce the desired core temperature and CPU utilization bounds. It handles discrete frequencies through a PulseWidth Modulation (PWM) that achieves effective thermal control by manipulating the dwelling time of discrete frequencies. As a result RT-MTC can achieve effective thermal control with only a small number of frequencies typical in current processors. The robustness and advantages of RTMTC over existing thermal control approaches are demonstrated through extensive simulations under a wide range of uncertainties in term of power consumption.
PhD Dissertation
1/18/2011 Reports/Attachments/941/thesis-main.pdf
The rapid growth of biosequence databases over the last decade has led to a performance bottleneck in the applications analyzing them. In particular, over the last five years DNA sequencing capacity of next-generation sequencers has been doubling every six months as costs have plummeted. The data produced by these sequencers is overwhelming traditional compute systems. We believe that in the future compute performance, not sequencing, will become the bottleneck in advancing genome science.

In this work, we investigate novel computing platforms to accelerate dynamic programming algorithms, which are popular in bioinformatics workloads. We study algorithm-specific hardware architectures that exploit fine-grained parallelism in dynamic programming kernels using field-programmable gate arrays (FPGAs). We advocate a high-level synthesis approach, using the recurrence equation abstraction to represent dynamic programming and polyhedral analysis to exploit parallelism.

We suggest a novel technique within the polyhedral model to optimize for throughput by pipelining independent computations on an array. This design technique improves on the state of the art, which builds latency-optimal arrays. We also suggest a method to dynamically switch between a family of designs using FPGA reconfiguration to achieve a significant performance boost. We have used polyhedral methods to parallelize the Nussinov RNA folding algorithm to build a family of accelerators that can trade resources for parallelism and are between 15-130x faster than a modern dual core CPU implementation. A Zuker RNA folding accelerator we built on a single workstation with four Xilinx Virtex 4 FPGAs outperforms 198 3 GHz Intel Core 2 Duo processors. Furthermore, our design running on a single FPGA is an order of magnitude faster than competing implementations on similar-generation FPGAs and graphics processors.

Our work is a step toward the goal of automated synthesis of hardware accelerators for dynamic programming algorithms.
An Inexpensive Robot Platform for Teleoperation and ExperimentationUse SHIFT+ENTER to open the menu (new window).
Daniel A. Lazewatsky and William D. Smart
12/20/2010 Reports/Attachments/940/wu_telepresence.pdf
Most commercially-available robots are either aimed at the research community, or are designed with a single purpose in mind.  The extensive hobbyist community has tended to focus on the hardware and the low-level software aspects.  We claim that there is a need for a low-cost, general-purpose robot, accessible to the hobbyist community, with sufficient computation and sensing to run ``research-grade'' software.  In this paper, we describe the design and implementation of such a robot.  We explicitly outline our design goals, and show how a capable robot can be assembled from off-the-shelf parts, for a modest cost, by a single person with only a few tools.  We also show how the robot can be used as a low-cost telepresence platform, giving the system a concrete purpose beyond being a low-cost development platform.
An Empirical Analysis on Point-Wise Machine Learning Techniques Using Regression Trees for Web-Search RankingUse SHIFT+ENTER to open the menu (new window).
MS Thesis
Ananth Mohan
12/10/2010 Reports/Attachments/939/AnanthMohan_thesis.pdf
Learning how to rank a set of objects relative to an user defined query has received much interest in the machine learning community during the past decade. In fact, there have been two recent competitions hosted by internationally prominent search companies to encourage research on ranking web site documents. Recent literature on learning to rank has focused on three approaches: point-wise, pair-wise, and list-wise. Many different kinds of classifiers, including boosted decision trees, neural networks, and SVMs have proven successful in the field. This thesis surveys traditional point-wise techniques that use regression trees for web-search ranking. The thesis contains empirical studies on Random Forests and Gradient Boosted Decision Trees, with novel augmentations to them on real world data sets. We also analyze how these point-wise techniques perform on new areas of research for web-search ranking: transfer learning and feature-cost aware models.
Toward a Two-Tier Clinical Warning System for Hospitalized PatientsUse SHIFT+ENTER to open the menu (new window).
Gregory Hackmann, Minmin Chen, Octav Chipara, Chenyang Lu, Yixin Chen, Marin Kollef, Thomas C. Bailey
11/9/2010 Reports/Attachments/938/amia11.pdf
Clinical study has found early detection and intervention to be essential for preventing clinical deterioration in patients at general hospital units.  In this paper, we envision a two-tiered early warning system designed to identify the signs of clinical deterioration and provide early warning of serious clinical events.  The first tier of the system automatically identifies patients at risk of clinical deterioration from existing electronic medical record databases.  The second tier performs real-time clinical event detection based on real-time vital sign data collected from on-body wireless sensors attached to those high-risk patients.  We employ machine-learning techniques to analyze data from both tiers, assigning scores to patients in real time.  The assigned scores can then be used to trigger early-intervention alerts.  Preliminary study of an early warning system component and a wireless clinical monitoring system component demonstrate the feasibility of this two-tiered approach.
Priority Assignment for Real-Time Flows in WirelessHART Sensor-Actuator NetworksUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen
11/3/2010 Reports/Attachments/937/Priority_Assignment.pdf
Recent years have witnessed the adoption of wireless sensor-actuator networks as a communication infrastructure for process control applications. An important enabling technology for industrial process control is WirelessHART, an open wireless sensor-actuator network standard specifically developed for process industries. A key challenge faced byWirelessHART networks is to meet the stringent real-time communication requirements imposed by feedback control systems in process industries. Fixed priority scheduling, a popular scheduling policy in real-time networks, has recently been shown to be an effective real-time transmission scheduling policy in WirelessHART networks. Priority assignment has a major impact on the schedulability of real-time flows in these networks. This paper investigates the open problem of priority assignment for periodic real-time flows for feedback control loops closed through a WirelessHART network. We first propose an optimal priority assignment algorithm based on branch and bound for any given worst case delay analysis. We then propose an efficient heuristic search algorithm for priority assignment. We also identify special cases where the heuristic search is optimal. Simulations based on random networks and the real topology of a physical sensor network testbed showed that the heuristic search algorithm achieved near optimal performance in terms of schedulability, while significantly outperforming traditional real-time priority assignment policies.
On Motion Parameterizations in Image Sequences from Fixed ViewpointsUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Manfred Georg
10/28/2010 Reports/Attachments/936/diss.pdf
This dissertation addresses the problem of parameterizing object motion within a set of images taken with a stationary camera.  We develop data-driven methods across all image scales: characterizing motion observed at the scale of individual pixels, along extended structures such as roads, and whole image deformations such as lungs deforming over time.  The primary contributions include (a) fundamental studies of the relationship between spatio-temporal image derivatives accumulated at a pixel, and the object motions at that pixel, (b) data driven approaches to parameterize breath motion and reconstruct lung CT data volumes, and (c) defining and offering initial results for a new class of Partially Unsupervised Manifold Learning (PUML) problems, which often arise in medical imagery.

Specifically, we create energy functions for measuring how consistent a given velocity vector is with observed spatio-temporal image derivatives.  These energy functions are used to fit parametric snake models to roads using velocity constraints.  We create an automatic data-driven technique for finding the breath phase of lung CT scans which is able to replace external belt measurements currently in use clinically.  This approach is extended to automatically create a full deformation model of a CT lung volume during breathing or heart MRI during breathing and heartbeat.  Additionally, motivated by real use cases, we address a scenario in which a dataset is collected along with meta-data which describes some, but not all, aspects of the dataset.  We create an embedding which displays the remaining variability in a dataset after accounting for variability related to the meta-data.
Dissertation of Manfred Georg

Robert Pless, Chair
Guy Genin
Cindy Grimm
Tao Ju
William Richard
John Shareshian
Optimal Design-space Exploration of Streaming ApplicationsUse SHIFT+ENTER to open the menu (new window).
Shobana Padmanabhan, Yixin Chen, and Roger D. Chamberlain
10/19/2010 Reports/Attachments/935/spadmanabhan_ip11.pdf
Many embedded and scientific applications are frequently pipelined asynchronously and deployed on architecturally diverse systems to meet performance requirements and resource constraints. We call such pipelined applications streaming applications. Typically, there are several design
parameters in the algorithms and architectures used that, when customized, impact the tradeoff between different metrics of application performance as well as resource utilization. Automatic exploration of this design space is the
goal of this research.

When using architecturally diverse systems to accelerate streaming applications, the design search space is often complex. We present a global optimization framework comprising a novel domain-specific variation of
branch-and-bound that reduces search complexity by exploiting the topology of the application's pipelining. We exploit the topological information to discover decomposability through the canonical Jordan block form and to originate two novel decomposition techniques that we name linear chaining and convex decomposition. The reduction in search complexity that we achieve for two prototypical real-world streaming applications is significant. For one application the size of the search space reduces from about 10^18 to
10^12, a million-fold reduction, and for the second application, the reduction is a factor of at least 20~million.
The Design and Implementation of MCFlow: a Real-time Multi-core Aware Middleware for Dependent Task GraphsUse SHIFT+ENTER to open the menu (new window).
Huang-Ming Huang, Christopher Gill, Chengyang Lu
10/14/2010 Reports/Attachments/934/mcflow1_a.pdf
Modern computer architectures have evolved from uni-processor platforms to multi-processor and multi-core plat- forms, but traditional real-time distributed middleware such as RT-CORBA has not kept pace with that evolution.
To address those issues, this paper describes the design and implementation of MCFlow, a new real-time distributed middleware for dependent task graphs running on multi-core platforms. MCFlow provides the following contributions to the state of the art in real-time middleware: (1) it provides an efficient C++ based component model through which computations can be configured flexibly for execution within a single core, across cores of a common host, or spanning multiple hosts; (2) it allows optimizations for inter-component communication to avoid data copying without sacrificing the parallel executability of data dependent tasks; (3) it strictly separates timing and functional concerns of an application so that they can evolve and can be configured independently; and (4) it provides a novel event dispatching architecture that uses lock free algorithms to avoid mutex locking and reduce memory contention, CPU context switching, and priority inversion. We also present an empirical evaluation that demonstrates the efficacy of our approach.
{hh1, cdgill, lu}
RT-Xen: Real-Time Virtualization Based on Fixed-Priority Hierarchical SchedulingUse SHIFT+ENTER to open the menu (new window).
Sisu Xi, Justin Wilson, Chenyang Lu, Christopher Gill
10/11/2010 Reports/Attachments/931/ecrts11_rtxen.pdf
The ability to build systems of systems is vital to the real-time community as we endeavor to build increasingly sophisticated systems. Subsystems must be modular and isolated to ensure desired real-time performance when assembled by the system integrator. While virtualization has gained broad adoption and success for system integration in non-real-time domains, no existing virtual machine manager is designed to enforce real-time performance. On the other hand, while a rich body of promising theories on hierarchical real-time scheduling algorithms has been proposed in the literature to support composable real-time performance guarantees, they have not been incorporated and evaluated in real virtualized systems. We have developed RT-Xen, the first virtual machine manager designed to support real-time domains through hierarchical real-time scheduling. RT-Xen bridges the gap between virtualization and hierarchical scheduling by introducing a hierarchical real-time scheduling framework in Xen, the most widely used open-source virtual machine manager. Experimental evaluation revealed important insights on the feasibility and efficacy real-time virtualization:
(1) RT-Xen can effective provide virtualization and scheduling to Linux guest OSes at 1~ms scheduling resolution (quantum) while incurring moderate overhead;
(2) RT-Xen can provide real-time performance isolation against overrun domains;
(3) RT-Xen significantly outperformed the standard credit-based Xen scheduler in term of real-time performance.
RT-Xen represents a promising step toward real-time virtualization for system integration of real-time applications.
End-to-End Delay Analysis for Fixed Priority Scheduling in WirelessHART NetworksUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah, You Xu, Chenyang Lu, and Yixin Chen
10/11/2010 Reports/Attachments/930/WH_SCHEDULABILITY.pdf
The WirelessHART standard has been specifically designed for real-time communication between sensor and actuator devices for industrial process monitoring and control. End-to-end communication delay analysis for WirelessHART networks is required for acceptance test of real-time data flows from sensors to actuators and for workload adjustment in response to network dynamics. In this paper, we map the scheduling of real-time periodic data flows in a WirelessHART network to real-time multiprocessor scheduling. We, then, exploit the response time analysis for multiprocessor scheduling and propose a novel method for the end-to-end delay analysis of the real-time flows that are scheduled using a fixed priority scheduling policy in a WirelessHART network. Simulations based on both random topologies and real network topologies of a physical testbed demonstrate the efficacy of our end-to-end delay analysis in terms of acceptance ratio under various fixed priority scheduling policies.
ARCH: Practical Channel Hopping for Reliable Home-Area Sensor NetworksUse SHIFT+ENTER to open the menu (new window).
Mo Sha, Gregory Hackmann, Chenyang Lu
10/9/2010 Reports/Attachments/929/ARCH.pdf
Home area networks (HANs) promise to enable sophisticated home automation applications such as smart energy usage and assisted living. However, recent empirical study of HAN reliability in real-world residential environments revealed significant challenges to achieving reliable performance in the face of significant and variable interference from a multitude of coexisting wireless devices. We propose the Adaptive and Robust Channel Hopping (ARCH) protocol: a lightweight receiveroriented protocol which handles the dynamics of residential environments by reactively channel hopping when channel conditions have degraded. ARCH has several key features. First, ARCH is an adaptive protocol that channel-hops based on changes in channel quality observed in real time. Second, ARCH is a distributed protocol that selects channels on a per-link basis, due to the large link-to-link variations in channel quality observed under empirical study. Third, ARCH is designed to be robust and lightweight. ARCH uses a practical hand-shaking approach to handle channel desynchronization and an efficient slidingwindow scheme that does not involve expensive calculations or modeling, and can be reasonably implemented on memoryconstrained wireless sensor platforms. Fourth, ARCH introduces minimal communication overhead for applications where packet acknowledgements are already enabled. We evaluate our approach through real deployment in real-life apartments with residents’ daily activity. Our results demonstrate that ARCH can reduce the number of packet retransmissions by a median of 42.3% compared to using a single, fixed wireless channel, and can enable up to a 2.2 improvement in delivery rate on the most unreliable links in our experiment. Under a multi-hop routing scenario, ARCH achieved an average 31.6% reduction in radio usage, by reducing the ETX along each path by up to 83.6%. Due to ARCH’s lightweight reactive design, most links achieve this improvement in reliability with 10 or fewer channel hops per day.
Abstractions and Algorithms for Control of Extensible and Heterogeneous Virtualized Network InfrastructuresUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Charles Wiseman
9/23/2010 Reports/Attachments/928/dissertation.pdf
Virtualized network infrastructures are currently deployed in both research and commercial contexts. The complexity of the virtualization layer varies greatly in different deployments, ranging from cloud computing environments, to carrier Ethernet applications using stacked VLANs, to networking testbeds.  In all of these cases, many users are sharing the resources of one provider and each user expects their resources to be isolated from all other users.  There are many challenges associated with the control and management of these systems, including resource allocation and sharing, resource isolation, system security, and usability.  Among the different types of virtualized infrastructures, network testbeds are of particular interest due to their widespread use in education and in the networking research community.

Networking researchers rely extensively on testbeds when evaluating new protocols and ideas.  Indeed, a substantial percentage of top research papers include results gathered from testbeds.  Network emulation testbeds in particular are often used to conduct innovative research because they allow users to emulate diverse network topologies in a controlled environment.  That is, researchers run experiments with a collection of resources that can be reconfigured to represent many different network scenarios.  The user typically has control over most of the resources in their experiment which results in a high level of reproducibility.  As such, these types of testbeds provide an excellent bridge between simulation and deployment of new ideas.  Unfortunately, most testbeds suffer from a general lack of resource extensibility and diversity.

This dissertation extends the current state of the art by designing a new, more general testbed infrastructure that expands and enhances the capabilities of modern testbeds.  This includes pertinent abstractions, software design, and related algorithms.  The design has also been prototyped in the form of the Open Network Laboratory network testbed, which has been successfully used in educational and research pursuits.  While the focus is on network testbeds, the results of this research will also be applicable to the broader class of virtualized system infrastructures.
Multi-Channel Reliability and Spectrum Usage in Real Homes: Empirical Studies for Home-Area Sensor NetworksUse SHIFT+ENTER to open the menu (new window).
Mo Sha, Gregory Hackmann, Chenyang Lu
7/30/2010 Reports/Attachments/926/infocom11.pdf
Home area networks (HANs) consisting of wireless sensors have emerged as the enabling technology for important applications such as smart energy and assisted living. A key challenge faced by HANs is maintaining reliable operation in real-world residential environments. This paper presents two in-depth empirical studies on the wireless channels in real homes. The spectrum study analyzes the spectrum usage in the 2.4 GHz band where wireless sensor networks based on the IEEE 802.15.4 standard must coexist with existing wireless devices. We characterize the ambient wireless environment in six apartments through passive spectrum analysis across the entire 2.4 GHz band over seven days in each of the apartments. Notably, we find that the wireless conditions in these residential environments can be much more complex and varied than in a typical office environment. Moreover, while 802.11 signals play a significant role in spectrum usage, there also exist non-negligible noise from non-802.11 devices. The multi-channel link study measures the reliability of different 802.15.4 channels through active probing with motes. We discover that there is not always a persistently reliable channel over 24 hours; that reliability is strongly correlated across adjacent channels; and that link reliability does not exhibit cyclic behavior at daily or weekly timescales. Nevertheless, reliability can be maintained through a small number of channel hops per day, suggesting channel diversity as a key tool for designing robust HANs in residential environments. Our empirical studies provide important guidelines and insights for robust wireless sensor network design in residential environments.
Multi-Tier Diversified Service Architecture for Internet 3.0: The Next Generation InternetUse SHIFT+ENTER to open the menu (new window).
Subharthi Paul, Raj Jain, Jianli Pan, Chakchai So-in
6/23/2010 Reports/Attachments/925/Jain_techreport.pdf
The next generation Internet needs to support multiple diverse application contexts. In this paper, we present Internet 3.0, a diversified, multi-tier architecture for the next generation Internet. Unlike the current Internet, Internet 3.0 defines a new set of primitives that allows diverse applications to compose and optimize their specific contexts over resources belonging to multiple ownerships. The key design philosophy is to enable diversity through explicit representation, negotiation and enforcement of policies at the granularity of network infrastructure, compute resources, data and users. The basis of the Internet 3.0 architecture is a generalized three-tier object model. The bottom tier consists of a high-speed network infrastructure. The second tier consists of compute resources or hosts. The third tier consists of data and users. The “tiered” organization of the entities in the object model depicts the natural dependency relationship between these entities in a communication context. All communication contexts, including the current Internet, may be represented as special cases within this generalized three-tier object model. The key contribution of this paper is a formal architectural representation of the Internet 3.0 architecture over the key primitive of the “Object Abstraction” and a detailed discussion of the various design aspects of the architecture, including the design of the “Context Router-” the key architectural element that powers an evolutionary deployment plan for the clean slate design ideas of Internet 3.0.
{spaul, jain,jp10,cs5}
Against All Probabilities: A modeling paradigm for streaming applications that goes against common notionsUse SHIFT+ENTER to open the menu (new window).
Rahav Dor
6/23/2010 Reports/Attachments/924/RahavDor_thesis.pdf
Hardware and software design requires the right portion of skills and mental faculties. The design of a good system is an exercise in rational thinking, engineering, and art. The design process is further complicated when we aspire to build systems that exploit parallelism or are targeted to be deployed on architecturally diverse computing devices, FPGAs or GPUs to name just a few.

The need to develop systems that can take advantage of computing devices beyond general purpose CPUs is real. There are several application domains and research efforts that will simply not be able to adequately perform or yield answers in a reasonable amount of time otherwise. Developing a mathematical model of a system is a key stepping stone to a high performance system, but often is absent from the design process due to the complexity of the model development.

In this paper we offer an easy, yet solid approach to the development of such mathematical models. This adds a little bit more weight to engineering side of the design process in the form of a quantifiable method that enables designers to reason about their systems, identify bottlenecks, and gain vital information for performance improvements.
Cloud Computing for Scalable Planning by Stochastic SearchUse SHIFT+ENTER to open the menu (new window).
Qiang Lu and You Xu and Ruoyun Huang and Yixin Chen
6/12/2010 Reports/Attachments/923/parallel-mrw.pdf
Graph search has been employed by
many AI techniques and applications. A natural
way to improve the efficiency of search is to utilize ad-
vanced, more powerful computing platforms.  However, expensive computing infrastructures, such as supercomputers and large-scale
clusters, are traditionally available to only a limited
number of projects and researchers. As a results, most
AI applications, with access to only commodity com-
puters and clusters, cannot benefit from the efficiency
improvements of high-performance parallel search algorithms.
Cloud computing provides an attractive, highly
accessible alternative to other traditional high-
performance computing platforms. In this paper, we
first show that the run-time of our stochastic search algorithm in planning is a heavy-tailed distribution, which has a remarkable variability. Second, we propose an algorithm framework that
takes advantage of cloud computing.
A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local ApproximationUse SHIFT+ENTER to open the menu (new window).
Tom Erez and William D. Smart
5/27/2010 Reports/Attachments/922/uai10_1.pdf
Partially-Observable Markov Decision Processes (POMDPs) are typically solved by finding an approximate global solution to a corresponding belief-MDP.
In this paper, we offer a new planning algorithm for POMDPs with continuous state, action and observation spaces.  Since such domains have an inherent notion of locality, we can find an approximate solution using local optimization methods.  We parameterize the belief distribution as a Gaussian mixture, and use the Extended Kalman Filter (EKF) to approximate the belief update.  Since the EKF is a first-order filter, we can marginalize over the observations analytically. By using feedback control and state estimation during policy execution, we recover a behavior that is effectively conditioned on incoming observations despite the unconditioned planning.
Local optimization provides no guarantees of global optimality, but it allows us to tackle domains that are at least an order of magnitude larger than the current state-of-the-art.  We demonstrate the scalability of our algorithm by considering a simulated  hand-eye coordination domain with 16 continuous state dimensions and 6 continuous action dimensions.
Optimal Time Utility Based Scheduling Policy Design for Cyber-Physical SystemsUse SHIFT+ENTER to open the menu (new window).
Terry Tidwell, Robert Glaubius, Christopher D. Gill and William D. Smart
5/25/2010 Reports/Attachments/921/rtssSubmission1.pdf
Classical scheduling abstractions such as deadlines and priorities do not readily capture the complex timing semantics found in many real-time cyber-physical systems. Time utility functions provide a necessarily richer description of timing semantics, but designing utility-aware scheduling policies using them is an open research problem. In particular, optimal utility accrual scheduling design is needed for real-time cyber-physical domains.

In this paper we design optimal utility accrual scheduling policies for cyber-physical systems with periodic, non-preemptable tasks that run with stochastic duration. These policies are derived by solving a Markov Decision Process formulation of the scheduling problem. We use this formulation to demonstrate that our technique improves on existing heuristic utility accrual scheduling policies.
Real-Time Scheduling for WirelessHART NetworksUse SHIFT+ENTER to open the menu (new window).
Abusayeed Saifullah, Chenyang Lu, You Xu, and Yixin Chen
5/14/2010 Reports/Attachments/920/RTSS_2010_WH.pdf
WirelessHART is an open wireless sensor-actuator network standard for industrial process monitoring and control that requires real-time data communication between sensor and actuator devices. Salient features of a WirelessHART network include a centralized network management architecture, multi-channel TDMA transmission, redundant routes, and avoidance of spatial reuse of channels for enhanced reliability and real-time performance. This paper makes several key contributions to real-time transmission scheduling in WirelessHART networks: (1) formulation of the end-to-end real-time transmission scheduling problem based on the characteristics of WirelessHART; (2) proof of NP-hardness of the problem; (3) an optimal branch-and-bound scheduling algorithm based on a necessary condition for schedulability; and (4) an efficient and practical heuristic-based scheduling algorithm called Conflict-aware Least Laxity First (C-LLF). Extensive simulations based on both random topologies and real network topologies of a physical testbed demonstrate that C-LLF is highly effective in meeting end-to-end deadlines in WirelessHART networks, and significantly outperforms common real-time scheduling policies.
An Environment for Stroke Therapy Game AuthoringUse SHIFT+ENTER to open the menu (new window).
MS Thesis
Simon Tam
5/12/2010 Reports/Attachments/919/Tam_thesis.pdf
Stroke is the leading cause of disability in the industrialized world. The disabilities caused by stroke can significantly impact the quality of daily life for stroke survivors. Studies show that a high number of daily body movement repetitions can help recover lost motor functionality. However, many stroke patients do not perform the home exercises needed to reach this high number of repetitions. Games for rehabilitation have been shown to provide motivation to perform home exercises. However, because stroke can leave survivors with different levels of disability, custom games are required. Creating custom games, though, is time consuming and requires advanced programming knowledge. A tool for quick and easy game development is a solution to this problem. In this master’s project, I collaborated with a doctoral occupational therapy student to take steps towards developing a game authoring environment custom to therapists.
Towards Real-time Wireless Sensor NetworksUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Octav Chipara
5/11/2010 Reports/Attachments/918/chipara_final.pdf
Wireless sensor networks are poised to change the way computer systems interact with the physical world. We plan on entrusting sensor systems to collect medical data from patients, monitor the safety of our infrastructure, and control manufacturing processes in our factories. To date, the focus of the sensor network community has been on developing best-effort services. This approach is insufficient for many applications since it does not enable developers to determine if a system's requirements in terms of communication latency, bandwidth utilization, reliability, or energy consumption are met. The focus of this thesis is to develop real-time network support for such critical applications.
The main theoretical contribution of this thesis is the development of novel transmission scheduling techniques optimized for data collection applications. This work bridges the gap between wireless sensor networks and real-time scheduling theory, which have traditionally been applied to processor scheduling.  The thesis also focuses on the development of a real-time patient monitoring system for general hospital units. We show that it is feasible to use wireless sensor networks for clinical monitoring through a study in a step-down cardiac care unit at Barnes-Jewish Hospital.
Split and Merge Functions for Supporting Multiple Processing Pipelines in Mercury BLASTNUse SHIFT+ENTER to open the menu (new window).
Jwalant Ahir; Jeremy Buhler; Roger D. Chamberlain
5/11/2010 Reports/Attachments/917/ahir_et_al1.pdf
Biosequence similarity search is an important application in computational biology. Mercury BLASTN, an FPGA-based implementation of BLAST for DNA, is one of the alternatives for fast DNA sequence comparison. The re-design of BLAST into a streaming application combined with a high-throughput hardware pipeline have enabled Mercury BLAST to emerge as one of the fastest implementations of bio-sequence similarity search. This performance can be further enhanced by exploiting the data-level parallelism present within the application. Here we present a multiple FPGA-based Mercury BLASTN design in order to double the speed and throughput of DNA sequence computation. This paper describes a dual Mercury BLASTN design, the detailed design of the split and merge functions, and simulation results.
Sorting as a Streaming Application Executing on Chip MultiprocessorsUse SHIFT+ENTER to open the menu (new window).
Roger D. Chamberlain; Greg A. Galloway; Mark A. Franklin
5/7/2010 Reports/Attachments/916/Chamberlain_etal.pdf
Expressing concurrency in applications has always been a difficult and error-prone endeavor, yet effective utilization of multi-core processors requires that the concurrency in applications be understood.  One approach to the expression of concurrency is streaming, which has shown real promise as a safe and effective method for many application classes.  Here, we express a classic problem, sorting, in the streaming paradigm and explore the implications of various algorithm and architectural design parameters on the performance of the application.
Discovering, Localizing, Calibrating, and Using Thousands of Outdoor WebcamsUse SHIFT+ENTER to open the menu (new window).
PhD Dissertation
Nathan Jacobs
5/4/2010 Reports/Attachments/915/dissertation_jacobs.pdf
The web has an enormous collection of live cameras that capture images of roads, beaches, cities, mountains, buildings, and forests.  With an appropriate foundation, this massively distributed, scalable, and already existing network of cameras could be used as a new global sensor.  The sheer number of cameras and their broad spatial distribution prompts new questions in computer vision: How can you automatically geo-locate each camera? How can you learn the 3D structure of the scene? How can you correct the color measurements from the camera? What environmental properties can you extract?

We address these questions using models of the geometry and statistics of natural outdoor scenes, and with an understanding of how image appearance varies due to transient objects, the weather, the time of day, and the season. We show algorithms to calibrate cameras and annotate scenes that formalize this understanding in classical linear and nonlinear optimization frameworks.  Our work uses images captured at long temporal scales, ranging from minutes to years; often these images are from a database of images we created by capturing  images from 1000 webcams every half hour.  By exploring natural cues capable of working over such long time scales on a broad range of scenes, we deepen our understanding of the interplay between geographic location, time, and the causes of image appearance change.
1 - 100 Next